Abstract


An old and celebrated analogy says that writing programs is like proving theorems.
This analogy has been productive in both directions, but in particular has
demonstrated  remarkable  utility  in  driving  progress  in  programming  languages,  for 
example  leading  towards  a  better  understanding  of  concepts  such  as  abstract  data 
types and polymorphism. One of the best known instances of the analogy actually
rises to the level of an isomorphism: between Gentzen's natural deduction and
Church's  lambda  calculus.  However,  as  has  been  recognized  for  a  while,  lambda 
calculus  fails  to  capture  some  of  the  important  features  of  modern  programming 
languages.  Notably,  it  does  not  have  an  inherent  notion  of  evaluation  order,  needed 
to make sense of programs with side effects. Instead, the historical descendants of
lambda  calculus  (languages  like  Lisp,  ML,  Haskell,  etc.)  impose  evaluation  order  in 
an  ad  hoc  way. 

This  thesis  aims  to  give  a  fresh  take  on  the  proofs-as-programs  analogy — one 
which  better  accounts  for  features  of  modern  programming  languages — by  starting 
from a different logical foundation. Inspired by Andreoli's focusing proofs for linear
logic, we explain how to axiomatize certain canonical forms of logical reasoning
through  a  notion  of  pattern.  Propositions  come  with  an  intrinsic  polarity,  based  on 
whether they are defined by patterns of proof, or by patterns of refutation. Applying
the analogy, we then obtain a programming language with built-in support
for  pattern-matching,  in  which  evaluation  order  is  explicitly  reflected  at  the  level  of 
types — and  hence  can  be  controlled  locally,  rather  than  being  an  ad  hoc,  global  policy 
decision. As we show, different forms of continuation-passing style (one of the historical
tools for analyzing evaluation order) can be described in terms of different polarizations.
This language provides an elegant, uniform account of both untyped and
intrinsically-typed  computation  (incorporating  ideas  from  infinitary  proof  theory), 
and  additionally,  can  be  provided  an  extrinsic  type  system  to  express  and  statically 
enforce  more  refined  properties  of  programs.  We  conclude  by  using  this  framework 
to  explore  the  theory  of  typing  and  subtyping  for  intersection  and  union  types  in  the 
presence  of  effects,  giving  a  simplified  explanation  of  some  of  the  unusual  artifacts 
of  existing  systems. 


I  would  take  these  six  weeks  of  relative  solitude  and  give  this  new  thing,  still  in  a 
file  called  X,  a  chance  to  grow.  If  nothing  came  of  it,  I  would  go  back  to  Fountain 
City,  having  wasted  only  a  month  and  a  half.  What  was  a  month  and  a  half  out  of 
five  years? 

The  new  book  seemed  to  want  to  take  place  in  Pittsburgh,  and  thus,  in  my  basement 
room,  I  returned  to  the  true  fountain  city,  the  mysterious  source  of  so  many  of  my 
ideas.  I  didn't  stop  to  think  about  what  I  was  doing,  whom  it  would  interest,  what 
my  publisher  and  the  critics  would  think  of  it,  and,  sweetest  of  all,  I  didn't  give  a 
single  thought  to  what  I  was  trying  to  say.  I  just  wrote. 

— Michael  Chabon,  "Diving  into  the  Wreck",  Maps  &  Legends 

Chapter 1

Introduction 


A  good  analogy  is  like  a  diagonal  frog. 

— Kai  Krause 

This  thesis  revisits  the  old  analogy  between  proving  and  programming:  a  proof  is  like  a 
program,  a  program  is  like  a  proof.  Perhaps  the  reason  why  this  analogy — known  variously  as 
proofs-as-programs,  propositions-as-types,  or  the  Curry-Howard  correspondence — has  remained  fresh 
after  several  decades  is  because  it  is  inspiring  in  both  directions.  Computer  scientists  can  tell 
themselves,  "I'm  not  just  wasting  my  time  writing  programs  all  day.  I'm  proving  theorems!" 
And  mathematicians  can  tell  themselves,  "I'm  not  just  wasting  time  proving  theorems  all  day. 
I'm  writing  programs!" 

Beyond matters of self-esteem, the proofs-as-programs analogy really has demonstrated
remarkable utility in driving progress in programming languages. Over the past few
decades,  many  different  ideas  from  logic  have  permeated  into  the  field,  reinterpreted  from  a 
computational  perspective  and  collectively  organized  as  type  theory.  In  the  1980s,  type  theory 
dramatically improved the theoretical understanding of difficult language concepts such as abstract
data types and polymorphism, and led directly to the development of groundbreaking
new languages such as ML and Haskell. In the other direction, type theory has also been applied
back towards the mechanization of mathematics, and the Curry-Howard correspondence
forms  the  basis  for  successful  proof  assistants  such  as  Coq  and  Agda.  Not  least,  the  analogy 
between  proving  and  programming  has  the  social  effect  of  linking  two  different  communities  of 
researchers:  although  people  who  write/study  proofs  and  people  who  write/study  programs 
often  have  very  different  motivations,  the  Curry-Howard  correspondence  says  that  in  some  ways 
they  are  doing  very  similar  things. 

Yet,  how  reasonable  is  the  analogy?  For  certain  formal  notions  of  "proof"  and  "program",  it 
goes beyond the level of an analogy to an isomorphism. The best-known example is the isomorphism
between Gentzen's natural deduction and Church's simply-typed lambda calculus, formalized
by William Howard in 1969. Howard's observation was influential not only in providing
support  for  the  analogy,  but  also  in  elevating  the  status  of  the  formalisms  themselves — after  all, 
the  same  mathematical  structure  was  observed  arising  independently  in  separate  contexts.  But 
then  again,  did  either  of  these  separate  formalisms  really  capture  the  properties  of  proofs  or 
programs  "as  God  intended"?  Or  at  least  (what  may  not  be  exactly  the  same  thing)  as  they 
occur  in  practice? 

In fact, it has been recognized for a long time that lambda calculus does not quite match up
with even its historical descendants, languages like LISP, ML and Haskell. John Reynolds wrote
in  1972, 

Purely  applicative  languages  are  often  said  to  be  based  on  a  logical  system  called 
the  lambda  calculus,  or  even  to  be  "syntactically  sugared"  versions  of  the  lambda 
calculus. . .  However,  as  we  will  see,  although  an  unsugared  applicative  language  is 
syntactically  equivalent  to  the  lambda  calculus,  there  is  a  subtle  semantic  difference. 

This  "subtle  semantic  difference"  Reynolds  was  describing  is  known  as  evaluation  order.  If  all  a 
program  ever  does  is  take  an  input,  compute  some  mathematical  function  on  it,  and  return  an 
answer,  then  the  order  in  which  it  performs  different  operations  may  be  important  for  matters 
of  efficiency,  but  not  for  the  ultimate  result.  However,  in  most  real  programming  languages, 
that  is  not  all  that  a  program  can  do.  Different  subroutines  may  never  return  an  answer,  looping 
endlessly.  Or  they  might  find  something  wrong  with  the  input  and  raise  an  exception,  or  worse, 
cause  a  system  crash;  or  they  might  print  to  the  screen,  or  prompt  the  user  for  input;  or  read 
and  write  locations  from  memory  or  disk;  or  transfer  control  to  another  thread. . . .  The  way 
in  which  these  possible  effects  are  staged  almost  always  affects  the  behavior  of  the  program, 
and  the  programmer  probably  had  a  certain  order  in  mind.  That  is  why  most  programming 
languages  specify  a  (mostly)  deterministic  order  of  evaluation.  But  this  is  something  imposed 
on  lambda-calculus  in  an  ad  hoc  way,  and  indeed  there  is  no  canonical  choice.  For  example, 
LISP  and  ML  adopt  different  strategies  from  Haskell. 
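
The dependence of behavior on evaluation order can be made concrete with a small sketch. The names below (`noisy`, `const_eager`, `const_lazy`) are hypothetical and chosen only for illustration; the eager version imitates the call-by-value strategy of LISP and ML, while the version with explicit thunks imitates call-by-name in the manner of Haskell.

```python
# Illustrative sketch only: all names here are hypothetical.
log = []

def noisy(label, value):
    """Return `value`, recording a side effect so evaluation order is observable."""
    log.append(label)
    return value

def const_eager(x, y):
    """Call-by-value: the caller has already evaluated both arguments."""
    return x

def const_lazy(x_thunk, y_thunk):
    """Call-by-name: arguments arrive as thunks and are forced only if needed."""
    return x_thunk()

def run_eager():
    log.clear()
    result = const_eager(noisy("x", 1), noisy("y", 2))
    return result, list(log)

def run_lazy():
    log.clear()
    result = const_lazy(lambda: noisy("x", 1), lambda: noisy("y", 2))
    return result, list(log)

print(run_eager())  # (1, ['x', 'y'])
print(run_lazy())   # (1, ['x'])
```

Both runs return the same value, but only the eager run performs the effect attached to the unused argument. If that effect were an exception or an infinite loop, the two strategies would no longer agree even on the result.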

The fact that lambda-calculus has this subtle but real and important difference with modern
programming languages may seem to cast doubt on the original analogy between programs
and  proofs.  Indeed,  side-effects  seem  to  precisely  reflect  the  ways  in  which  real  programs  are 
unlike  proofs.  How  does  a  proof  raise  an  exception,  or  print  to  the  screen? 

But evaluation order is not the only way in which the lambda-calculus differs from modern
functional programming languages. Church's original formulation was fundamentally about
functions.  "Everything  is  a  function",  as  the  slogan  goes.  But  while  it  is  possible  (as  Church 
showed)  to  encode  any  datatype  in  the  pure  lambda-calculus,  and  while  these  encodings  are 
elegant  and  insightful — they  aren't  particularly  natural.  Hardly  anyone  would  think  of  using, 
say,  the  Church-encoding  of  binary  trees  on  their  first  pass  at  writing  a  search  routine.  (Perhaps 
they  will  on  a  second  or  third  pass,  if  the  routine  has  some  subtle  control  flow  they  can't 
capture  with  a  more  basic  data  structure.)  One  of  the  simple  but  enormously  practical  features 
of  languages  like  ML  and  Haskell  is  the  ability  to  use  algebraic  data  types  and  define  functions 
on  them  by  pattern-matching.  This  feature  often  enables  programmers  to  understand  the  behavior 
of  their  subroutines  using  completely  first-order  equational  reasoning,  without  necessarily  having 
to  reason  about  higher-order  functions  (and  the  potential  side-effects  those  functions  may  incur). 
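
As a rough illustration of the contrast, the sketch below writes a membership test on binary search trees twice: once by case analysis on a data type, and once on a Church-style encoding in which a tree is represented by its own fold. Python has no algebraic data types, so the classes `Leaf` and `Node` and the functions `church_leaf`, `church_node` are stand-ins assumed for this example.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Leaf:
    pass

@dataclass(frozen=True)
class Node:
    left: "Tree"
    key: int
    right: "Tree"

Tree = Union[Leaf, Node]

def member(tree, key):
    """Search a binary search tree by case analysis on its constructor."""
    if isinstance(tree, Leaf):
        return False
    if key == tree.key:
        return True
    return member(tree.left if key < tree.key else tree.right, key)

# Church-style encoding: a tree is a function taking one handler per constructor.
def church_leaf():
    return lambda on_leaf, on_node: on_leaf

def church_node(left, key, right):
    return lambda on_leaf, on_node: on_node(
        left(on_leaf, on_node), key, right(on_leaf, on_node)
    )

def church_member(tree, key):
    """Membership via the fold; the encoding itself fixes the traversal of all subtrees."""
    return tree(False, lambda l, k, r: l or k == key or r)

t = Node(Node(Leaf(), 1, Leaf()), 3, Node(Leaf(), 5, Leaf()))
ct = church_node(church_node(church_leaf(), 1, church_leaf()), 3,
                 church_node(church_leaf(), 5, church_leaf()))
print(member(t, 5), church_member(ct, 5))  # True True
```

The first version can be read equation by equation, one case per constructor. The second is correct too, but understanding it requires reasoning about the higher-order functions passed in as handlers.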

In  contrast  to  evaluation  order  and  side-effects,  pattern-matching  is  something  borrowed 
directly  from  mathematical  practice,  where  it  is  used  both  as  a  notation  for  defining  functions, 
and  as  a  notation  for  writing  proofs  by  case-analysis.  And  so  it  reveals  a  deficiency  in  the  other 
half of the Curry-Howard isomorphism: the fact that Gentzen's natural deduction lacks pattern-matching
facilities makes it not so natural. Where an ordinary mathematical proof might just
list  a  set  of  different  cases  without  extra  justification,  in  natural  deduction  (and  its  close  relative, 
sequent  calculus)  the  single  case-analysis  is  converted  into  a  series  of  steps,  breaking  down  each 
binary  conjunction  and  disjunction  individually,  ultimately  arriving  at  the  same  thing. 




It  may  be  unsurprising  that  performing  all  of  these  microscopic  steps  is  not  just  tedious,  but 
also  inefficient  if  we  want  to  discover  proofs.  The  proof  search  community  has  therefore  devised 
many  different  strategies  for  making  larger  (but  still  sound)  inferences  when  looking  for  proofs, 
to reduce this inefficiency. A fundamental breakthrough was made by Andreoli [1992], with a proof
search strategy he called focusing for Girard's linear logic [1987]. Andreoli's observation was
that in the context of sequent calculus proof search, the linear logic connectives exhibit a natural
duality. Depending on which side of the sequent a formula A ⋆ B is placed (i.e., whether it
is assumed to be true or false), the connective ⋆ will either behave "synchronously" or
"asynchronously". Precisely half of the connectives of linear logic (⊗, ⊕, 1, 0, and !, which Andreoli
called synchronous) are invertible on the left side of the sequent (i.e., they can be decomposed
eagerly), and non-invertible on the right. Whereas for the other half (&, ⅋, ⊥, ⊤, and ?, which
Andreoli called asynchronous) the opposite is true. Andreoli used this observation to build an
efficient  proof  search  strategy  for  linear  logic,  alternating  between  dual  phases  he  called  inversion 
and  focus.  Girard  [1993]  followed  up  on  Andreoli's  idea,  and  termed  this  choice  of  bias  towards 
the  right  or  to  the  left  the  polarity  of  a  connective.  Girard's  insight  was  that  polarity  was  a 
very  widespread  phenomenon,  observable  not  only  in  linear  logic,  but  also  in  intuitionistic  and 
constructive  versions  of  classical  logic.  While  the  connectives  of  classical  logic  appear  to  have  no 
bias  towards  proof  or  refutation,  they  can  be  explicitly  polarized,  positively  or  negatively,  which 
endows  them  with  different  constructive  content. 

The  central  claim  of  this  thesis  is  that 

focusing  proofs  provide  a  logical  account  of  evaluation  order  and  pattern-matching 

This claim obviously has at least two parts, one about evaluation order and one about pattern-matching.
The first claim can be stated more specifically:

evaluation  order  is  determined  by  polarity 

Because  polarity  is  a  property  of  particular  connectives,  rather  than  of  an  entire  logic,  this  in 
turn  has  the  implication  that  one  language  can  (should)  mix  different  evaluation  strategies,  by  reflecting 
them  at  the  level  of  types.  The  second  claim,  more  specifically,  is  that 

pattern-matching  is  justified  by  polarity 

which in turn has an implication that pattern-matching is not just "syntactic sugar": it can (should)
be  dealt  with  directly  in  type  theory.  Most  of  these  subclaims  have  already  appeared  in  some  form 
or  another  prior  to  this  work.  But  the  point  is  that  pattern-matching  and  evaluation  order  really 
are  two  sides  of  the  same  coin.  And  focusing,  I  will  argue,  provides  the  right  concepts  for 
treating  evaluation  order  and  pattern-matching  in  a  unified  way,  and  understanding  the  duality 
between  them. 

While  focusing  was  discovered  in  the  early  1990s,  many  of  the  ideas  behind  it  are  much  older. 
One illustration is the various attempts made in the 1970s (by Dag Prawitz, Michael Dummett,
Per Martin-Löf, and others) to provide philosophical interpretations that would in some sense
"justify" the logical laws. In particular, Dummett explored the idea that the laws of natural
deduction could be justified by alternate "verificationist" or "pragmatist meaning-theories".¹ The
idea, essentially, was that either the introduction rules or the elimination rules for a connective
could be taken as constituting its definition. Taking the first view (the verificationist one), an
elimination rule can then be justified, by showing that a verification (i.e., introduction) of its
premises already implies its conclusion. But taking the pragmatist view, it is the introduction
rules which are justified from the elimination rules, by showing that any consequence (i.e.,
elimination) of the conclusion could have already been derived from the premises.

¹ In the 1976 William James Lectures, later published as The Logical Basis of Metaphysics [Dummett, 1991].

Dummett's analysis can be displayed graphically as a sort of Aristotelian "square of opposition"
of assertions about a proposition A:


The  verificationist  interpretation  of  A  deals  in  the  top  half  of  the  square:  one  explains  the 
meaning  of  A  in  terms  of  its  direct  proofs,  and  then  one  is  justified  in  deriving  consequences 
from  A  by  case-analysis  over  its  direct  proofs.  The  pragmatist  interpretation  of  A  deals  in  the 
bottom  half. 

So  Dummett's  investigation  already  contained  a  hint  of  polarity,  and  of  the  duality  between 
focus and inversion. But Dummett insisted on a requirement of harmony between the two approaches.
What was missing from his account was the possibility of simply accepting diversity:
that the different meaning-theories actually define different connectives. That was the insight of
linear logic, which showed we could distinguish a plethora of connectives (two conjunctions, ⊗
and &; two disjunctions, ⊕ and ⅋; etc.) and relate them by modalities. And it accords with
our  operational  intuitions  from  programming  languages,  that  for  example  strict  products  and 
lazy  products  really  are  different  things,  and  we  would  sometimes  like  to  be  able  to  speak  about 
both  within  the  same  language.  Our  aim  is  thus  to  build  a  polarized  type  theory,  where  we  can 
say  what  we  mean. 
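
The operational difference between the two kinds of product can be sketched as follows. The class names `StrictPair` and `LazyPair` are hypothetical, and reading them as counterparts of the positive conjunction ⊗ and the negative conjunction & is an interpretation suggested by the discussion above rather than a definition taken from it.

```python
class StrictPair:
    """Strict product: both components are evaluated before the pair exists."""
    def __init__(self, first, second):
        self.first = first
        self.second = second

class LazyPair:
    """Lazy product: components are suspended and run only when projected."""
    def __init__(self, first_thunk, second_thunk):
        self._first = first_thunk
        self._second = second_thunk

    def first(self):
        return self._first()

    def second(self):
        return self._second()

def fail():
    """A component whose evaluation has an effect: it raises."""
    raise ValueError("this component raises when evaluated")

def build_strict():
    try:
        return StrictPair(1, fail()).first
    except ValueError:
        return "raised while building the pair"

def build_lazy():
    return LazyPair(lambda: 1, fail).first()

print(build_strict())  # raised while building the pair
print(build_lazy())    # 1
```

Both classes live in the same program, and the choice between them is visible in the type of the value rather than in a global evaluation policy.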

The  framework  that  is  probably  most  closely  related  to  the  one  developed  here  is  Paul  Levy's 
call-by-push-value [Levy, 2001, 2004]. CBPV also maintains a distinction between two different
kinds  of  types,  which  Levy  calls  value  types  and  computation  types,  and  these  correspond,  more 
or  less,  to  positive  types  and  negative  types  as  I  use  them.  However,  the  language  I  present 
here  was  developed  independently  of  CBPV.  The  CBPV  paradigm  arose  strictly  out  of  semantic 
concerns,  by  trying  to  unify  denotational  models  of  call-by-value  and  call-by-name,  without  any 
dependence  on  or  connection  to  proof  theory.  In  contrast,  the  framework  developed  here  arose 
strictly  out  of  proof-theoretic  concerns,  by  trying  to  understand  the  effect  of  evaluation  order  on 
subtyping.  The  two  alternate  accounts  thus  give  each  other  independent  confirmation.  On  the 
other  hand,  I  also  think  that  the  special  emphasis  on  proof  theory  in  this  thesis  helps  us  to  gain 
some  new  ground  not  already  covered  by  CBPV,  particularly  with  respect  to  refinement  types 
(e.g.,  intersection  and  union  types)  and  subtyping. 

The  remaining  chapters  of  the  thesis  are  organized  as  follows: 

Chapter  2  (Canonical  derivations).  We  introduce  the  underlying  logical  objects  which  will 
later be given a computational interpretation. We develop a generic account of proof and refutation
for polarized propositions, based on a primitive notion of pattern, in the form of an iterated
inductive  definition.  We  show  that  these  derivations  satisfy  suitable  principles  of  identity  and 
composition,  and  describe  how  they  induce  different  notions  of  entailment. 

Chapter 3 (Focusing proofs and double-negation translations). We explain in what sense
the canonical derivations of Chapter 2 correspond to Andreoli's focusing proofs. We exploit this
correspondence to better understand the relationship between classical logic, polarized logic, and
minimal logic, showing how to reconstruct different double-negation translations of classical into
minimal logic (Glivenko, Gödel-Gentzen, etc.) by factoring them through polarized logic.

Chapter 4 (Proofs as programs). We reinterpret the proof objects of Chapter 2 through
the lens of Curry-Howard, and show that this gives rise to phenomena familiar from the operational
semantics of programming languages, notably pattern-matching and continuation-passing
style. We give an intrinsic definition of a programming language (terms are equated with logical
derivations) but develop a type-free notation for terms, with an equational theory and an
environment-based operational semantics. We define observation equivalence, and show that it
coincides with syntactic equality in the presence of sufficient effects. We study programming
with mixed polarity types, and explain how this gives a rich language for describing mixed evaluation
order. Finally, we convert the results about double-negation translations in Chapter 3 to
the computational setting, and show how to reconstruct different CPS translations of λ-calculus
(call-by-value, call-by-name, etc.) by way of polarization.

Chapter  5  (Concrete  notations  for  abstract  derivations).  We  describe  embeddings  of  the 
language  C+  defined  in  Chapter  4  into  two  existing  logical  frameworks  based  on  dependent  type 
theory,  Agda  and  Twelf.  The  two  frameworks  have  different  proof-theoretic  strength,  which  helps 
us  to  better  understand  the  features  of  C+  by  compiling  them  down  to  lower-level  primitives.  In 
particular, the Twelf embedding employs a novel use of defunctionalization to compile pattern-matching.

Chapter  6  (Refinement  types  and  completeness  of  subtyping).  We  develop  an  extrinsic  view 
of  polarized  type  theory,  allowing  more  precise  properties  of  terms  to  be  specified  through  more 
refined  types.  One  of  our  motivations  is  to  better  understand  operationally-sensitive  artifacts  in 
historical  type  systems,  giving  an  explanation  for  why,  e.g.,  intersection  types  require  a  value 
restriction  in  ML.  We  give  two  interpretations  of  subtyping — one  demanding  an  explicit  witness 
to  the  safety  of  a  subtyping  relationship,  one  asking  only  for  the  absence  of  counterexamples — 
and  study  the  relationship  between  them.  We  show  that  the  two  forms  of  subtyping  coincide  in 
the  presence  of  sufficient  effects. 

Chapter  7  (Conclusion).  We  summarize  the  contributions  of  the  thesis,  and  discuss  some 
paths  for  future  work. 

Chapter 2

Canonical  derivations 


Once  we  have  understood  how  to  discover  individual  patterns  which  are  alive,  we 
may  then  make  a  language  for  ourselves,  for  any  building  task  we  face. 

— Christopher Alexander, The Timeless Way of Building

In  this  chapter,  I  introduce  the  basic  proof  objects  which  will  later  be  given  a  proofs-as- 
programs  interpretation.  I  call  them  canonical  derivations,  since  they  enumerate  canonical  forms 
of  proof  and  refutation.  As  I  will  explain  in  the  next  chapter,  canonical  derivations  can  also 
be  seen  as  an  alternate  presentation  of  Andreoli's  focusing  proofs,  for  classical  propositional 
logic. However, I would rather not start out by explaining them that way, because the connection
to classical logic in the classical sense is actually only indirect, a kind of double-negation
translation; in fact, canonical derivations have a stronger connection to minimal (and
"co-minimal") logic. My primary aim in this chapter is to describe the structure of canonical
derivations, and only secondarily to study an entailment relation induced by them (we will see in
§2.3.3  that  there  are  actually  two  distinct  but  equally  legitimate  ways  of  defining  entailment). 

Our  subject  is  polarized  logic.  That  is,  logic  in  which  every  proposition  has  a  definite  polarity, 
positive or negative. As I alluded to in the Introduction, one way of understanding polarity (in
the spirit of the "meaning explanations" of the '70s put forth by Prawitz, Dummett, and
Martin-Löf) is that positive propositions are "defined" by their form of introduction, and negative
propositions  by  their  form  of  elimination.  In  Dummett's  sense,  positive  propositions  have  a 
"verificationist  meaning-theory",  and  negative  propositions  a  "pragmatist  meaning-theory".  The 
primary  contribution  of  this  chapter  is  a  system  of  logical  inference  that  makes  this  intuition 
formal  through  a  notion  of  pattern.  A  pattern  is  basically  a  derivation  with  holes.  The  key 
intuition,  which  is  actually  a  formal  property  of  polarized  logic,  is  that: 

positive  connectives  are  defined  by  their  proof  patterns 

negative  connectives  are  defined  by  their  refutation  patterns 

These  statements  are  meant  literally.  That  is,  we  will  define  canonical  derivations  in  two  stages: 
first  we  define  individual  connectives  by  describing  their  proof  patterns  (for  positive  connectives) 
or  refutation  patterns  (for  negative  connectives),  and  then  we  give  generic  rules  for  full  proof 
and  refutation,  described  in  terms  of  patterns. 

To  get  this  project  off  the  ground,  we  will  have  to  pay  careful  attention  to  the  distinction 
between propositions and judgments [Martin-Löf, 1996]. To be clear, we are interested in two
different forms of judgment about a polarized proposition, on equal footing: both its assertion,
A true, and its denial, A false. For notational concision, we will simply write the former as A
and the latter as •A, but it is important to keep in mind that these denote judgments,
not propositions.

We will treat proof and refutation in a completely symmetric way; or rather, we will treat
them asymmetrically in one way for positive polarity, and then treat them asymmetrically in
precisely the opposite way for negative polarity. This is a matter of expediency, because it lets
us  understand  the  basic  idea  of  polarity  without  being  overburdened  by  too  many  distinctions. 
On  the  other  hand,  it  is  a  bit  simplistic.  General  intuitionistic  implication,  for  example,  cannot 
be  expressed  in  this  framework — it  is  naturally  defined  as  a  negative  connective,  but  in  terms  of 
patterns  for  deriving  arbitrary  consequences,  rather  than  in  terms  of  refutation  patterns.  However, 
I  think  this  framework  has  enough  asymmetry  to  be  interesting,  and  that  it  serves  as  a  good 
foundation  from  which  to  study  more  sophisticated  uses  of  polarity,  by  careful  generalization 
and  symmetry-breaking. 

2.1  A  proof-biased  logic 

To  start,  in  this  section  I  will  restrict  to  only  positive  connectives,  showing  how  to  define  patterns, 
proofs  and  refutations,  and  how  to  establish  the  identity  and  composition  principles.  After 
this  initial  setup,  identifying  a  negative  fragment — and  then  generalizing  to  a  unified  polarized 
logic — will  be  relatively  easy.  As  explained  in  the  Introduction,  I  will  need  different  symbols  to 
distinguish  the  polarized  connectives — the  notation  below  is  mostly  borrowed  from  linear  logic. 

2.1.1  Refutation  frames,  proof  patterns,  connectives 

In this section I use the letters A, B, C to range over positive propositions.

Definition 2.1.1 (Frames of refutation holes). A frame Δ is a list •A₁, …, •Aₙ of refutation holes.
We write · for the empty frame, and Δ₁, Δ₂ for the concatenation of two frames. We write Δ ∈ Δ′ for
list containment, i.e., the following inductively defined relation:

$$\frac{}{\Delta \in \Delta} \qquad \frac{\Delta \in \Delta_1}{\Delta \in \Delta_1, \Delta_2} \qquad \frac{\Delta \in \Delta_2}{\Delta \in \Delta_1, \Delta_2}$$

Now, we inductively define a judgment Δ ⊩ B relating frames and positive propositions.
Intuitively, Δ ⊩ B says that it is possible to derive B directly from the premises Δ. Or in other
words, a derivation of Δ ⊩ B gives the outline of a proof, leaving holes for refutations. For
example, we define conjunction and truth (0-ary conjunction) as follows:

$$\frac{}{\cdot \Vdash 1} \qquad \frac{\Delta_1 \Vdash A \quad \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B}$$

Intuitively, these definitions express that a proof of truth requires no premises, while a proof
of the conjunction A ⊗ B requires proofs of A and B, and combines their respective premises.
Likewise, we define disjunction and falsehood (0-ary disjunction) with the following rules:

$$\text{(no rule for } 0\text{)} \qquad \frac{\Delta \Vdash A}{\Delta \Vdash A \oplus B} \qquad \frac{\Delta \Vdash B}{\Delta \Vdash A \oplus B}$$
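
$$\frac{}{\cdot \Vdash 1} \qquad \frac{\Delta_1 \Vdash A \quad \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B}$$

$$\text{(no rule for } 0\text{)} \qquad \frac{\Delta \Vdash A}{\Delta \Vdash A \oplus B} \qquad \frac{\Delta \Vdash B}{\Delta \Vdash A \oplus B}$$

$$\frac{}{\bullet A \Vdash \neg^{+} A}$$

Figure 2.1: Definition of some positive connectives by proof patterns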


Intuitively, these definitions express that there is no proof of falsehood, while a proof of the
disjunction A ⊕ B requires either a proof of A or a proof of B. Finally, negation is defined as
follows:

$$\frac{}{\bullet A \Vdash \neg^{+} A}$$

This definition obviously doesn't express very much: just that a proof of ¬⁺A requires a refutation
of A. We use the notation ¬⁺ to mark this negation as having positive polarity, as opposed to
the negative polarity negation ¬⁻ defined further below. When the polarity is clear from context,
however, we sometimes simply write ¬A.

Again,  I  take  the  above  rules  as  literally  a  definition  of  the  connectives.  For  quick  reference, 
this  definition  is  displayed  in  Figure  2.1. 

Definition 2.1.2 (Proof patterns). A derivation of Δ ⊩ A is called a proof pattern, or more specifically
an A-pattern. We refer to the frame Δ as the frame of the pattern. The set of all A-patterns is called the
support of A.

Intuitively,  the  support  of  A  describes  all  possible  ways  of  proving  A.  From  now  on,  I  won't 
speak  of  individual  connectives  except  in  examples,  instead  dealing  generically  with  refutation 
frames  and  proof  patterns. 

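Example 2.1.3. Let C = ¬A ⊗ (¬B₁ ⊕ ¬B₂). There are exactly two patterns in the support of C:

$$\dfrac{\dfrac{}{\bullet A \Vdash \neg A} \qquad \dfrac{\dfrac{}{\bullet B_1 \Vdash \neg B_1}}{\bullet B_1 \Vdash \neg B_1 \oplus \neg B_2}}{\bullet A, \bullet B_1 \Vdash C} \qquad\qquad \dfrac{\dfrac{}{\bullet A \Vdash \neg A} \qquad \dfrac{\dfrac{}{\bullet B_2 \Vdash \neg B_2}}{\bullet B_2 \Vdash \neg B_1 \oplus \neg B_2}}{\bullet A, \bullet B_2 \Vdash C}$$

These two patterns have frames •A, •B₁ and •A, •B₂, respectively. ■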


2.1.2  The  definition  ordering 

The  notion  of  pattern  induces  a  more  abstract  definition  of  subformula  (cf.  [Takeuti,  1975]). 

Definition 2.1.4 (Definition ordering). We write Δ ≺ B between a frame and a proposition if there is
some proof pattern Δ ⊩ B, and write A ≺ Δ between a proposition and a frame if there is some refutation
hole •A ∈ Δ. The definition ordering is the transitive closure of ≺.

Definition 2.1.5 (Definition tree, ancestors). The restriction of ≺ below A (written ≺_A) is called the
definition tree of A. Conceptually, the tree grows backwards, "towards a simpler time", and we call the
(non-root) elements of the tree the ancestors of A. Similarly, we write ≺_Δ for the restriction of ≺ below
Δ, and refer to the elements of the tree as Δ's ancestors.

We can think of the ancestors of A as its "abstract subformulas". For example, the proposition
$C = \neg A \otimes (\neg B_1 \oplus \neg B_2)$ has abstract subformulas $A$, $B_1$, and $B_2$. The concrete syntactic subformulas
$\neg A$, $\neg B_1 \oplus \neg B_2$, etc., are not ancestors of $C$ by this definition.

Proposition 2.1.6. For any proposition A (or frame $\Delta$) built out of a finite combination of the connectives
$1, 0, \otimes, \oplus, \overset{+}{\neg}$, the definition tree $\prec_A$ ($\prec_\Delta$) is well-founded.

Proof. Every negation in A marks a branch of the definition tree. Indeed, if A is $\neg$-free then it
has no ancestors (compare this to the usual definition of subformula). □
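
The ancestors of a proposition can be collected by repeatedly unfolding patterns and following their holes. The sketch below assumes the same illustrative tuple encoding as a plain structural reading of the pattern rules; `support` and `ancestors` are hypothetical helper names. It returns only the proposition ancestors and leaves the intermediate frames implicit.

```python
from itertools import product

# Assumed encoding: ("one",) ("zero",) ("tensor", A, B) ("plus", A, B)
# ("not", A); a bare string is an opaque proposition that is not unfolded.

def support(prop):
    """Frames (tuples of refutation holes) of all proof patterns of `prop`."""
    tag = prop[0]
    if tag == "one":
        return [()]
    if tag == "zero":
        return []
    if tag == "tensor":
        return [d1 + d2
                for d1, d2 in product(support(prop[1]), support(prop[2]))]
    if tag == "plus":
        return support(prop[1]) + support(prop[2])
    if tag == "not":
        return [(prop[1],)]
    raise ValueError(f"unknown connective: {tag!r}")

def ancestors(prop):
    """Propositions strictly below `prop` in the definition ordering."""
    found, todo = set(), [prop]
    while todo:
        p = todo.pop()
        if isinstance(p, str):
            continue                      # opaque: nothing to unfold
        for frame in support(p):
            for hole in frame:
                if hole not in found:
                    found.add(hole)
                    todo.append(hole)
    return found

C = ("tensor", ("not", "A"), ("plus", ("not", "B1"), ("not", "B2")))
print(sorted(ancestors(C)))   # ['A', 'B1', 'B2']
```

Because every step of the loop moves from a proposition to a hole sitting strictly inside it, the loop terminates on any finite proposition, which is the content of the well-foundedness claim for this fragment.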

2.1.3  Proofs  and  refutations 

Suppose we have defined some positive connectives and their proof patterns. Now we can
explain how to build actual proofs and refutations. In addition to the judgments $A$ and $\bullet A$, we
use the auxiliary judgments $\Delta$ and $\#$. Intuitively, $\Delta$ asserts the conjunction of all its hypotheses,
while $\#$ asserts contradiction. Obviously for contradiction, but also for the other judgments, we
are mainly interested in reasoning relative to a context.

Definition 2.1.7 (Contexts). A context $\Gamma$ is a list of frames. We define the containment relationship
between frames and contexts as follows:

\[
\dfrac{\Delta \in \Delta'}{\Delta \in \Gamma, \Delta'}
\qquad
\dfrac{\Delta \in \Gamma}{\Delta \in \Gamma, \Delta'}
\]

Since a frame is also a list (of refutation hypotheses) we can always flatten a context and view
it as a frame, but it will nonetheless be useful to have a conceptual distinction between frames
and contexts. We write $\Gamma \vdash J$ to assert a judgment $J$ (any of $A$, $\bullet A$, $\Delta$, or $\#$) relative to context
$\Gamma$. We now explain the formal meaning of these judgments with inference rules.

A: To prove A, we must choose some A-proof pattern, and derive its frame.

\[
\dfrac{\Delta \Vdash A \qquad \Gamma \vdash \Delta}{\Gamma \vdash A}
\]

In  other  words,  if  a  proof  pattern  describes  a  proof  with  holes,  to  build  an  actual  proof  we  must 
fill these holes. Observe that because there can be many possible A-patterns, there can be many
possible  ways  to  apply  this  rule,  and  proving  A  requires  making  a  choice. 

•A: To refute A, we must examine every possible A-proof pattern, and show how to derive a
contradiction from its frame.

\[
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A}
\]

The arrow in the premise means that to every possible derivation of the left-hand side ($\Delta \Vdash A$),
we must give a derivation of the right-hand side ($\Gamma, \Delta \vdash \#$). Note that there is only one possible
way to apply this rule, given the definition of A. Since the propositions defined in §2.1.1 all have
finite support, we can view this rule as having finitely many premises, one for each A-pattern.
In general, though, we would like to give this rule a more open-ended interpretation, and view
the premise as literally demanding a function from A-patterns to contradictions, rather than a
simple list of contradictions.¹ We will say much more about this in Chapters 4 and 5.

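Δ: To assert $\Delta$, we must supply refutations for all its holes.

\[
\dfrac{}{\Gamma \vdash \cdot}
\qquad
\dfrac{\Gamma \vdash \Delta_1 \qquad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\]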

These  rules  tell  us  explicitly  how  to  unravel  a  frame,  but  really  the  order  is  arbitrary  because 
frames  are  associative.  Formally,  we  have  the  following: 

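Observation 2.1.8. Any derivation of $\Gamma \vdash \Delta$ determines a map from refutation holes $\bullet A \in \Delta$ to
derivations $\Gamma \vdash \bullet A$, and conversely, given such a map we can build a derivation of $\Gamma \vdash \Delta$. In other
words, the two rules above are interchangeable with the following:

\[
\dfrac{\bullet A \in \Delta \;\longrightarrow\; \Gamma \vdash \bullet A}{\Gamma \vdash \Delta}
\]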

#: To derive contradiction, we must find a proposition assumed to be false, and prove it.

\[
\dfrac{\bullet A \in \Gamma \qquad \Gamma \vdash A}{\Gamma \vdash \#}
\]

Again,  there  can  be  many  possible  ways  of  applying  this  rule,  for  every  refutation  hole  in  the 
context. 

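Example 2.1.9. Let $C = \neg A \otimes (\neg B_1 \oplus \neg B_2)$ as in Example 2.1.3. By applying the two possible
C-patterns and instantiating the general rules, we can build two derived rules for proving C,
one for each C-pattern:

\[
\dfrac{\Gamma \vdash \bullet A \qquad \Gamma \vdash \bullet B_1}{\Gamma \vdash \neg A \otimes (\neg B_1 \oplus \neg B_2)}
\qquad
\dfrac{\Gamma \vdash \bullet A \qquad \Gamma \vdash \bullet B_2}{\Gamma \vdash \neg A \otimes (\neg B_1 \oplus \neg B_2)}
\]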


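Example 2.1.10. Let C be as above. The derived rule for refuting C has two premises:

\[
\dfrac{\Gamma, \bullet A, \bullet B_1 \vdash \# \qquad \Gamma, \bullet A, \bullet B_2 \vdash \#}{\Gamma \vdash \bullet \neg A \otimes (\neg B_1 \oplus \neg B_2)}
\]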
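
The three canonical rules can be read as a small search procedure. The following is a sketch under stated assumptions, not a procedure given in the text: propositions are closed (built only from connectives), a context is flattened to a set of refutation hypotheses, and a goal that recurs along the current search path is abandoned so that the search terminates. The names `support`, `prove`, `refute` and `contra` are illustrative.

```python
from itertools import product

# Assumed encoding of closed positive propositions:
#   ("one",) ("zero",) ("tensor", A, B) ("plus", A, B) ("not", A)

def support(prop):
    """Frames (tuples of refutation holes) of all proof patterns of `prop`."""
    tag = prop[0]
    if tag == "one":
        return [()]
    if tag == "zero":
        return []
    if tag == "tensor":
        return [d1 + d2
                for d1, d2 in product(support(prop[1]), support(prop[2]))]
    if tag == "plus":
        return support(prop[1]) + support(prop[2])
    if tag == "not":
        return [(prop[1],)]
    raise ValueError(f"unknown connective: {tag!r}")

def prove(ctx, a, seen=frozenset()):
    """ctx |- A: choose some A-pattern and refute every hole of its frame."""
    goal = ("prove", ctx, a)
    if goal in seen:                      # same goal again on this path
        return False
    seen = seen | {goal}
    return any(all(refute(ctx, b, seen) for b in frame)
               for frame in support(a))

def refute(ctx, a, seen=frozenset()):
    """ctx |- *A: every A-pattern's frame, added to ctx, gives contradiction."""
    goal = ("refute", ctx, a)
    if goal in seen:
        return False
    seen = seen | {goal}
    return all(contra(ctx | frozenset(frame), seen) for frame in support(a))

def contra(ctx, seen=frozenset()):
    """ctx |- #: find some hypothesis *A in ctx and prove A."""
    return any(prove(ctx, a, seen) for a in ctx)

# Excluded middle for A = not 1: the hypothesis *(A + not A) is contradictory.
A = ("not", ("one",))
EM = ("plus", A, ("not", A))
print(contra(frozenset({EM})))            # True
```

Note how `prove` uses `any` over patterns (a choice) while `refute` uses `all` (a case analysis), matching the asymmetry between the two rules.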


We  take  the  above  rules  to  be  canonical,  in  the  sense  that  they  enumerate  canonical  forms 
of  proof,  refutation,  etc.,  guided  completely  by  the  definition  ordering.  For  this  reason,  we 
explicitly  omit  rules  such  as: 

¹ Our notation is borrowed from Martin-Löf's for the theory of iterated inductive definitions. We should note
that this inductive definition really is iterated in an essential way: the reason it makes sense to quantify over proof
patterns is because they have already been given an inductive definition, prior to proofs. The reader familiar with
the work of Buchholz et al. [1981] might notice that the refutation rule bears a very close formal resemblance to the
so-called Ω-rule.

Frames $\Delta ::= \bullet A \mid \cdot \mid (\Delta_1, \Delta_2)$
Contexts $\Gamma ::= \cdot \mid \Gamma, \Delta$

\[
\dfrac{\Delta \Vdash A \qquad \Gamma \vdash \Delta}{\Gamma \vdash A}
\qquad
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A}
\]
\[
\dfrac{}{\Gamma \vdash \cdot}
\qquad
\dfrac{\Gamma \vdash \Delta_1 \qquad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\qquad
\dfrac{\bullet A \in \Gamma \qquad \Gamma \vdash A}{\Gamma \vdash \#}
\]

Figure 2.2: Canonical derivations for positive propositions

\[
\dfrac{\Gamma \vdash \bullet A \qquad \Gamma \vdash A}{\Gamma \vdash \#}
\]

which would allow deriving contradiction by picking some arbitrary proposition, and showing
that it has both a proof and a refutation. Instead, we will show below that this and similar rules
are admissible: if $\Gamma \vdash \bullet A$ and $\Gamma \vdash A$ then there is a canonical derivation of $\Gamma \vdash \#$ (i.e., one that
begins by finding some $\bullet B \in \Gamma$ and showing $\Gamma \vdash B$). We summarize the definition of canonical
derivations in Figure 2.2.

2.1.4  Identity  and  composition 

How  do  we  know  that  canonical  derivations  are  a  reasonable  notion?  One  sanity  check  is 
that  they  satisfy  identity  and  composition  principles.  In  general,  an  identity  principle  for  a 
hypothetical  judgment  allows  us  to  derive  a  corresponding  conclusion  for  any  assumption.  Since 
a context $\Gamma$ can be seen either as a collection of refutation holes $\bullet A$ or as a collection of arbitrary
frames $\Delta$, we conceptually distinguish two identity principles:

Identity (refutation). If $\bullet A \in \Gamma$ then $\Gamma \vdash \bullet A$

Identity (frame). If $\Delta \in \Gamma$ then $\Gamma \vdash \Delta$

We say that these are the identity principles respectively on $A$ and on $\Delta$. To demonstrate their
validity, we give a pair of mutually recursive derivations. From $\bullet A \in \Gamma$, we derive $\Gamma \vdash \bullet A$ as
follows:


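\[
\dfrac{\dfrac{\dfrac{\Delta \Vdash A \qquad \Gamma, \Delta \vdash \Delta}{\bullet A \in \Gamma, \Delta \qquad \Gamma, \Delta \vdash A}}{\Delta \Vdash A \;\longrightarrow\; \Gamma, \Delta \vdash \#}}{\Gamma \vdash \bullet A}
\]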

where  in  the  last  step  (from  the  bottom)  we  are  appealing  to  frame  identity.  Frame  identity  is 
trivial,  since  it  just  expands  into  a  list  of  refutation  identities: 


\[
\dfrac{\vdots}{\Gamma \vdash \bullet A}
\qquad
\dfrac{}{\Gamma \vdash \cdot}
\qquad
\dfrac{\Gamma \vdash \Delta_1 \qquad \Gamma \vdash \Delta_2}{\Gamma \vdash \Delta_1, \Delta_2}
\]

These derivations use a few trivial properties of the containment relation, namely that $\bullet A \in \Gamma$
implies $\bullet A \in \Gamma, \Delta$, and that $\Delta_1, \Delta_2 \in \Gamma$ implies $\Delta_1 \in \Gamma$. Otherwise, though, coming up with
them was a mechanical exercise; these are really the only possible generic derivations, given
the rules of Figure 2.2. But do they make sense as derivations? Are they well-founded? The
answer depends on the definition ordering.

Theorem 2.1.11. The identities on $A$ and $\Delta$ are well-founded just in case $\prec_A$ and $\prec_\Delta$ are well-founded.

Proof.  The  recursive  calls  between  the  two  derivations  precisely  mirror  the  definition  ordering. 

□ 

As  we  saw  in  Proposition  2.1.6,  the  definition  ordering  is  always  well-founded  if  we  consider 
only  the  boolean  logical  connectives.  Later,  when  we  study  arbitrary  recursive  types,  this  will 
no  longer  be  the  case,  and  in  order  to  make  sense  of  the  identity  principles  we  will  have  to  move 
to  a  coinductive  interpretation  of  derivations. 
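
The mutual recursion between the two identity principles can be made concrete by building the identity derivations as data. This is only a sketch under assumptions: the tuple encoding of closed propositions, the names `id_refute` and `id_frame`, and the representation of a derivation as nested tuples are all illustrative and not taken from the text.

```python
from itertools import product

# Assumed encoding of closed positive propositions:
#   ("one",) ("zero",) ("tensor", A, B) ("plus", A, B) ("not", A)

def support(prop):
    """Frames (tuples of refutation holes) of all proof patterns of `prop`."""
    tag = prop[0]
    if tag == "one":
        return [()]
    if tag == "zero":
        return []
    if tag == "tensor":
        return [d1 + d2
                for d1, d2 in product(support(prop[1]), support(prop[2]))]
    if tag == "plus":
        return support(prop[1]) + support(prop[2])
    if tag == "not":
        return [(prop[1],)]
    raise ValueError(f"unknown connective: {tag!r}")

def id_refute(a):
    """Identity on A: from the hypothesis *A, build a refutation of A.

    The result has one entry per A-pattern. Each entry derives contradiction
    by using the hypothesis *A and re-proving A with that same pattern,
    whose frame is then satisfied by frame identity.
    """
    return [(frame, ("contra", a, ("prove", frame, id_frame(frame))))
            for frame in support(a)]

def id_frame(frame):
    """Identity on a frame: one refutation identity per hole."""
    return tuple(id_refute(b) for b in frame)

A = ("plus", ("one",), ("not", ("one",)))
derivation = id_refute(A)
print(len(derivation))                    # 2, one case per A-pattern
```

Each call to `id_refute` recurses only on holes of patterns of its argument, so the call graph follows the definition ordering, and the construction terminates exactly when that ordering is well-founded below the starting proposition.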

In general, a composition principle is a way of combining two related inferences into a single
inference. We have two ways of composing canonical derivations (recall that $J$ ranges over
judgments $A$, $\bullet A$, $\Delta'$, or $\#$):

Composition (reduction). If $\Gamma \vdash A$ and $\Gamma \vdash \bullet A$ then $\Gamma \vdash \#$

Composition (substitution). If $\Gamma \vdash \Delta$ and $\Gamma(\Delta) \vdash J$ then $\Gamma \vdash J$

In the substitution principle, the standard notation $\Gamma(\Delta)$ indicates a context with $\Delta$ plugged
somewhere into $\Gamma$. To be more explicit, it says that $\Gamma$ can be rewritten as the concatenation of
two contexts $\Gamma = \Gamma_1, \Gamma_2$, and that $\Gamma(\Delta) = \Gamma_1, \Delta, \Gamma_2$.

We say that these are the composition principles on $A$ and on $\Delta$, respectively. Again, we
illustrate the validity of the composition principles with a pair of mutually recursive definitions.
Note that we require an additional principle, that it is always possible to weaken a canonical
derivation with an additional frame.

Weakening. If $\Gamma \vdash J$ then $\Gamma(\Delta) \vdash J$

Proof. Immediate by induction on the derivation of $\Gamma \vdash J$, using the properties of the containment
relation. □

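Now, suppose we have canonical derivations of $\Gamma \vdash A$ and $\Gamma \vdash \bullet A$. These must have the following
form:

\[
\dfrac{\Delta \Vdash A \qquad \Gamma \vdash \Delta}{\Gamma \vdash A}
\qquad
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A}
\]

By plugging in the A-pattern from the proof of A into the premise of the refutation of A, we get
a derivation of $\Gamma, \Delta \vdash \#$. Then we derive $\Gamma \vdash \#$ by composing with the derivation of $\Gamma \vdash \Delta$.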

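Suppose we have derivations of $\Gamma \vdash \Delta$ and $\Gamma(\Delta) \vdash J$. We consider the possible shapes of the
latter:

\[
\dfrac{\Delta' \Vdash A \qquad \Gamma(\Delta) \vdash \Delta'}{\Gamma(\Delta) \vdash A}
\qquad
\dfrac{\Delta' \Vdash A \;\longrightarrow\; \Gamma(\Delta), \Delta' \vdash \#}{\Gamma(\Delta) \vdash \bullet A}
\]
\[
\dfrac{}{\Gamma(\Delta) \vdash \cdot}
\qquad
\dfrac{\Gamma(\Delta) \vdash \Delta_1 \qquad \Gamma(\Delta) \vdash \Delta_2}{\Gamma(\Delta) \vdash (\Delta_1, \Delta_2)}
\qquad
\dfrac{\bullet A \in \Gamma(\Delta) \qquad \Gamma(\Delta) \vdash A}{\Gamma(\Delta) \vdash \#}
\]

Most of these cases are trivial: we just apply the composition principle recursively to the
subderivations, and re-apply the rule (in the case of the refutation rule, we must also invoke
weakening on the first derivation). The only interesting case is the contradiction rule, when we have
$\bullet A \in \Gamma(\Delta)$ by virtue of $\bullet A \in \Delta$. Since $\Gamma \vdash \Delta$ and $\bullet A \in \Delta$ implies $\Gamma \vdash \bullet A$ (Observation 2.1.8),
after performing substitution on $\Gamma(\Delta) \vdash A$ to obtain $\Gamma \vdash A$, we can apply reduction to obtain
$\Gamma \vdash \#$.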

Again,  we  must  verify  that  these  definitions  are  well-founded. 

Theorem 2.1.12. The compositions on $A$ and $\Delta$ are well-founded just in case $\prec_A$ and $\prec_\Delta$ are
well-founded.

Proof.  As  with  the  identity  principle,  the  recursive  calls  between  the  two  forms  of  composition 
precisely  mirror  the  definition  ordering.  □ 

2.1.5  Complex  frames 

We  have  seen  that  the  identity  and  composition  principles  can  be  justified  on  a  generic  basis, 
by  taking  advantage  of  the  uniform  definition  of  connectives  by  patterns.  This  is  interesting 
from  a  philosophical  perspective,  in  the  sense  that  it  gives  a  "justification"  of  the  logical  laws  in 
the style of Prawitz/Dummett/Martin-Löf, exploiting a general inversion principle for positive
propositions.  It  is  also  interesting  from  a  purely  proof-theoretic  perspective,  because  as  we  will 
see  in  the  next  chapter,  the  composition  of  two  canonical  derivations  corresponds  exactly  to 
the  cut  of  two  sequent  calculus  proofs,  so  our  generic  composition  theorem  is  also  a  generic 
cut-elimination  theorem.  Although  attempts  at  giving  generic  criteria  for  cut-elimination  have 
been made before,² it is still a common belief that proving cut-elimination theorems requires a
tedious  case  analysis  on  all  possible  matchings  of  the  rules  for  the  different  connectives — which 
we  entirely  avoided.  And  as  we  will  see  in  Chapter  4,  this  pattern-based  justification  of  identity 
and  cut  also  has  a  deep  computational  significance,  giving  us,  for  example,  a  generic  proof  of 
type  safety  for  a  programming  language. 

So  it  is  nice  that  we  have  available  this  sort  of  generic  justification.  On  the  other  hand,  when 
presenting  a  particular  derivation,  there  are  times  when  we  don't  care  about  the  justification. 
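Suppose, e.g., that we want to derive $\bullet A \oplus \neg A \vdash \#$. A canonical derivation proceeds like so:

\[
\dfrac{
  \dfrac{
    \Delta \Vdash A \;\longrightarrow\;
    \dfrac{
      \dfrac{\Delta \Vdash A \qquad \dfrac{}{\bullet A \oplus \neg A,\ \Delta \vdash \Delta}\ \mathrm{Id}}
            {\bullet A \oplus \neg A,\ \Delta \vdash A \oplus \neg A}\ {*_1}
    }{\bullet A \oplus \neg A,\ \Delta \vdash \#}
  }{\bullet A \oplus \neg A \vdash A \oplus \neg A}\ {*_2}
}{\bullet A \oplus \neg A \vdash \#}
\]

where Id marks the (frame) identity principle, and $*_1$ and $*_2$ indicate the two derived rules for
proving $A \oplus \neg A$:

\[
\dfrac{\Delta \Vdash A \qquad \Gamma \vdash \Delta}{\Gamma \vdash A \oplus \neg A}\ {*_1}
\qquad
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash A \oplus \neg A}\ {*_2}
\]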

² See in particular the recent work of Ciabattoni and Terui [2006], which gives a survey of prior work.

The $*_2$ rule quantifies over all patterns $\Delta \Vdash A$, yet, in the derivation of $\bullet A \oplus \neg A \vdash \#$, we don't
actually perform any analysis of these patterns, simply reusing them to build a proof of A.
It might be said, then, that including the patterns explicitly in the derivation is unnecessary
notational overhead.

Suppose  that  we  could  instead  simply  place  a  hypothesis  A  in  the  context  to  stand  abstractly 
for  this  quantification  over  patterns.  Then  we  might  give  a  notationally  friendlier  presentation 
of  the  same  canonical  derivation  as  follows: 


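\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{\dfrac{}{\bullet A \oplus \neg A,\ A \vdash A}\ \mathrm{Id}}
            {\bullet A \oplus \neg A,\ A \vdash A \oplus \neg A}\ {*_1}
    }{\bullet A \oplus \neg A,\ A \vdash \#}
  }{\bullet A \oplus \neg A \vdash A \oplus \neg A}\ {*_2}
}{\bullet A \oplus \neg A \vdash \#}
\]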

Having  this  shorthand  notation  will  be  very  convenient  as  we  work  with  canonical  derivations, 
so  let  us  introduce  it  formally. 

Definition 2.1.13 (Complex frames). A complex frame can contain complex proof hypotheses $A$
in addition to refutation holes $\bullet A$. We distinguish frames that don't contain any complex hypotheses as
simple.

We  call  these  frames  complex  because  they  can  always  be  decomposed  into  simple  frames.  The 
only  way  to  use  a  complex  hypothesis  inside  a  frame  is  to  perform  a  case  distinction  on  patterns: 

\[
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(A) \vdash J}
\]

For  example,  we  can  apply  this  rule  to  derive  the  identity  principle  for  complex  proof  variables: 

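\[
\dfrac{\Delta \Vdash A \;\longrightarrow\; \dfrac{\Delta \Vdash A \qquad \dfrac{}{\Gamma(\Delta) \vdash \Delta}\ \mathrm{Id}}{\Gamma(\Delta) \vdash A}}{\Gamma(A) \vdash A}
\]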

This might seem backwards: isn't the point of complex hypotheses that we don't have to
analyze them? Formally, what we have to observe is that the above rule is invertible, i.e., its
conclusion implies its premise.

Proposition 2.1.14 (Pattern substitution). If $\Gamma(A) \vdash J$ and $\Delta \Vdash A$ then $\Gamma(\Delta) \vdash J$.

Proof.  Trivial,  by  walking  through  the  canonical  derivation  to  find  the  place,  if  any,  where  the 
complex  hypothesis  is  used,  and  substituting  the  pattern  into  the  premise.  □ 


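Frames $\Delta ::= \cdots \mid A$

\[
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(A) \vdash J}
\]

Figure 2.3: Complex proof hypotheses

So, we are justified in reading the above rule as bidirectional,

\[
\begin{array}{c}
\Delta \Vdash A \;\longrightarrow\; \Gamma(\Delta) \vdash J \\
\hline\hline
\Gamma(A) \vdash J
\end{array}
\]
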

but  it  is  important  to  understand  that  we  are  not  actually  adding  a  new  rule  going  in  the  upside- 
down  direction.  When  presenting  a  canonical  derivation,  we  are  permitted  to  introduce  new 
complex  variables  precisely  because  they  can  be  analyzed  away — just  as  we  are  justified  in  using 
the  identity  and  composition  principles  because  they  can  be  eliminated.  Note  that  we  do  have 
to  verify  that  the  composition  principles  still  hold  in  the  presence  of  complex  hypotheses,  but 
this  is  trivial,  because  we  can  always  apply  pattern  substitution  to  bring  the  case  analysis  to  the 
front  and  compose  simple  subderivations. 

Again,  let's  look  at  the  second,  prettier  derivation: 

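\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{\dfrac{}{\bullet A \oplus \neg A,\ A \vdash A}\ \mathrm{Id}}
            {\bullet A \oplus \neg A,\ A \vdash A \oplus \neg A}\ {*_1}
    }{\bullet A \oplus \neg A,\ A \vdash \#}
  }{\bullet A \oplus \neg A \vdash A \oplus \neg A}\ {*_2}
}{\bullet A \oplus \neg A \vdash \#}
\]

Step $*_2$ in the derivation is an admissible step, rather than a canonical rule. Step $*_1$ is likewise
only an admissible step (because the derived rule $*_1$ actually requires us to go all the way down
to a pattern for A).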

Finally,  let  us  mention  one  other  way  of  understanding  complex  hypotheses.  When  describing 
the  canonical  rule  of  refutation  in  §2.1.3,  we  deliberately  left  open-ended  the  ways  in  which 
functions  from  proof  patterns  to  contradictions  could  be  constructed.  By  introducing  complex 
hypotheses,  we  are  making  explicit  a  particular  form  of  construction,  whereby  functions  are 
simply  defined  by  substitution,  rather  than  case  analysis.  Because  this  is  such  a  ubiquitous  form 
of  definition,  it  seems  worthwhile  to  treat  it  explicitly. 

2.2  A  refutation-biased  logic 

We  repeat  the  development  of  §2.1,  but  with  everything  reversed. 

2.2.1  Proof  frames,  refutation  patterns,  definition  ordering 

In  this  section  I  use  the  letters  A,  B,  C  to  range  over  negative  propositions. 

Definition 2.2.1 (Frames of proof holes). Frames are now taken to be lists of hypotheses $A_1, \ldots, A_n$.
The $A_i \in \Delta$ are called proof holes.

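Negative connectives are defined by the judgment $\Delta \Vdash \bullet A$, which asserts that we can build
a refutation of A leaving holes for premises $\Delta$. For example, we give a negative definition of
conjunction as follows:

\[
(\text{no rule for } \top)
\qquad
\dfrac{\Delta \Vdash \bullet A}{\Delta \Vdash \bullet A \mathbin{\&} B}
\qquad
\dfrac{\Delta \Vdash \bullet B}{\Delta \Vdash \bullet A \mathbin{\&} B}
\]

Intuitively this says that there is no refutation of truth, and to refute $A \mathbin{\&} B$ we can refute either
A or B. Negative disjunction is defined as follows:

\[
\dfrac{}{\cdot \Vdash \bullet \bot}
\qquad
\dfrac{\Delta_1 \Vdash \bullet A \qquad \Delta_2 \Vdash \bullet B}{\Delta_1, \Delta_2 \Vdash \bullet A \parr B}
\]

This says that falsehood is directly refutable, while a refutation of the disjunction $A \parr B$ requires
refutations of both A and B, combining their respective premises. And finally, negative polarity
negation is defined by the axiom $A \Vdash \bullet \overset{-}{\neg} A$.

\[
(\text{no rule for } \top)
\qquad
\dfrac{\Delta \Vdash \bullet A}{\Delta \Vdash \bullet A \mathbin{\&} B}
\qquad
\dfrac{\Delta \Vdash \bullet B}{\Delta \Vdash \bullet A \mathbin{\&} B}
\]
\[
\dfrac{}{\cdot \Vdash \bullet \bot}
\qquad
\dfrac{\Delta_1 \Vdash \bullet A \qquad \Delta_2 \Vdash \bullet B}{\Delta_1, \Delta_2 \Vdash \bullet A \parr B}
\qquad
\dfrac{}{A \Vdash \bullet \overset{-}{\neg} A}
\]

Figure 2.4: Definition of some negative connectives by refutation patterns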

Definition 2.2.2 (Refutation patterns). A derivation of $\Delta \Vdash \bullet A$ is called a refutation pattern, or
more specifically an A-pattern. We use the letter d to range over refutation patterns. As we did with
proof patterns, we refer to the frame $\Delta$ as the frame of d, and to the set of all A-refutation patterns as the
support of A.


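Example 2.2.3. Let $C' = \neg A \mathbin{\&} (\neg B_1 \parr \neg B_2)$. There are exactly two $C'$-patterns:

\[
\dfrac{A \Vdash \bullet \neg A}{A \Vdash \bullet C'}
\qquad
\dfrac{\dfrac{B_1 \Vdash \bullet \neg B_1 \qquad B_2 \Vdash \bullet \neg B_2}{B_1, B_2 \Vdash \bullet \neg B_1 \parr \neg B_2}}{B_1, B_2 \Vdash \bullet C'}
\]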


The definition ordering for negative propositions is defined completely analogously to the
positive case: $\Delta \prec B$ if there is some refutation pattern $\Delta \Vdash \bullet B$, and $A \prec \Delta$ if there is some proof
hole $A \in \Delta$.
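
Refutation patterns of negative propositions can be enumerated in the same way as proof patterns, with the roles of the connectives swapped: & offers a choice, the negative disjunction combines frames, ⊤ has no patterns, and ⊥ has one with empty frame. The sketch below assumes an illustrative tuple encoding; the tags and the name `refutation_support` are not from the text.

```python
from itertools import product

# Assumed encoding of negative propositions:
#   ("top",) ("bot",) ("with", A, B) ("par", A, B) ("not", A)
# A bare string such as "A" stands for a proposition we do not unfold.

def refutation_support(prop):
    """Return the frames of all refutation patterns of `prop`.

    A frame is a tuple of propositions, each read as a proof hole.
    """
    if not isinstance(prop, tuple):
        raise ValueError(f"cannot unfold opaque proposition: {prop!r}")
    tag = prop[0]
    if tag == "top":
        return []                         # truth has no refutation
    if tag == "bot":
        return [()]                       # falsehood is directly refutable
    if tag == "with":
        # refute either component
        return refutation_support(prop[1]) + refutation_support(prop[2])
    if tag == "par":
        # refute both components, combining their frames
        return [d1 + d2 for d1, d2 in product(refutation_support(prop[1]),
                                              refutation_support(prop[2]))]
    if tag == "not":
        return [(prop[1],)]               # a single proof hole
    raise ValueError(f"unknown connective: {tag!r}")

# The proposition C' of Example 2.2.3
C_neg = ("with", ("not", "A"), ("par", ("not", "B1"), ("not", "B2")))
for frame in refutation_support(C_neg):
    print(frame)
# ('A',)
# ('B1', 'B2')
```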

2.2.2  Proofs  and  refutations,  identity  and  composition,  complex  hypotheses 

We  follow  the  template  of  §2.1.3,  but  reverse  the  roles  of  proof  and  refutation.  Explicitly,  we 
define  the  four  judgments  with  the  following  canonical  rules  (summarized  in  Figure  2.5): 

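•A: To refute A, we must choose some A-refutation pattern, and derive its frame.

\[
\dfrac{\Delta \Vdash \bullet A \qquad \Gamma \vdash \Delta}{\Gamma \vdash \bullet A}
\]

Frames $\Delta ::= A \mid \cdot \mid (\Delta_1, \Delta_2)$
Contexts $\Gamma ::= \cdot \mid \Gamma, \Delta$

\[
\dfrac{\Delta \Vdash \bullet A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash A}
\qquad
\dfrac{\Delta \Vdash \bullet A \qquad \Gamma \vdash \Delta}{\Gamma \vdash \bullet A}
\]
\[
\dfrac{}{\Gamma \vdash \cdot}
\qquad
\dfrac{\Gamma \vdash \Delta_1 \qquad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\qquad
\dfrac{A \in \Gamma \qquad \Gamma \vdash \bullet A}{\Gamma \vdash \#}
\]

Figure 2.5: Canonical derivations for negative propositions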


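A: To prove A, we must examine every possible A-refutation pattern, and show how to derive a
contradiction from its frame.

\[
\dfrac{\Delta \Vdash \bullet A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash A}
\]

Δ: To assert $\Delta$, we must provide evidence for all of its proof holes.

\[
\dfrac{}{\Gamma \vdash \cdot}
\qquad
\dfrac{\Gamma \vdash \Delta_1 \qquad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\]

#: To derive contradiction, we must find some hypothesis assumed to be true, and refute it.

\[
\dfrac{A \in \Gamma \qquad \Gamma \vdash \bullet A}{\Gamma \vdash \#}
\]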

It  is  important  to  understand  that  for  negative  propositions,  proof  means  proof-by-contradiction, 
whereas  refutation  must  be  direct.  This  is  dual  to  the  situation  for  positive  propositions,  where 
proof  must  be  direct,  while  refutation  is  by  contradiction. 

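Example 2.2.4. Let $C'$ be as in Example 2.2.3. The derived rules for proving and refuting $C'$ are:

\[
\dfrac{\Gamma, A \vdash \# \qquad \Gamma, B_1, B_2 \vdash \#}{\Gamma \vdash C'}
\qquad
\dfrac{\Gamma \vdash A}{\Gamma \vdash \bullet C'}
\qquad
\dfrac{\Gamma \vdash B_1 \qquad \Gamma \vdash B_2}{\Gamma \vdash \bullet C'}
\]

Let us compare these rules with those for the positive proposition $C = \neg A \otimes (\neg B_1 \oplus \neg B_2)$:

\[
\dfrac{\Gamma \vdash \bullet A \qquad \Gamma \vdash \bullet B_1}{\Gamma \vdash C}
\qquad
\dfrac{\Gamma \vdash \bullet A \qquad \Gamma \vdash \bullet B_2}{\Gamma \vdash C}
\qquad
\dfrac{\Gamma, \bullet A, \bullet B_1 \vdash \# \qquad \Gamma, \bullet A, \bullet B_2 \vdash \#}{\Gamma \vdash \bullet C}
\]

The positive $C$ and negative $C'$ are both legitimate interpretations of the unpolarized proposition
$\neg A \wedge (\neg B_1 \vee \neg B_2)$, but as we see they result in different rules of proof and refutation.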


For  this  new  notion  of  canonical  derivations,  we  can  state  identity  and  composition  principles 
analogous  to  those  of  §2.1.4: 


Identity (proof). If $A \in \Gamma$ then $\Gamma \vdash A$

Identity (frame). If $\Delta \in \Gamma$ then $\Gamma \vdash \Delta$

Composition (reduction). If $\Gamma \vdash A$ and $\Gamma \vdash \bullet A$ then $\Gamma \vdash \#$

Composition (substitution). If $\Gamma \vdash \Delta$ and $\Gamma(\Delta) \vdash J$ then $\Gamma \vdash J$

The  proof  of  these  principles  is  likewise  completely  analogous,  conditioned  on  well-foundedness 
of  the  definition  ordering. 

Indeed,  there  is  an  obvious  bijection  between  the  two  forms  of  canonical  derivations.  Define 
the  dual  A°  of  a  formula  A  as  follows: 

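\[
\begin{aligned}
1^\circ &= \bot \\
0^\circ &= \top \\
(A \oplus B)^\circ &= A^\circ \mathbin{\&} B^\circ \\
(A \otimes B)^\circ &= A^\circ \parr B^\circ \\
(\overset{+}{\neg} A)^\circ &= \overset{-}{\neg} A^\circ
\end{aligned}
\]

the dual of an assertion/denial as follows:

\[
(A)^\circ = \bullet A^\circ \qquad (\bullet A)^\circ = A^\circ
\]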

and  set  #°  =  #,  and  extend  (— )°  to  frames  pointwise.  Note  that  (— )°  is  an  involution  on  both 
propositions  and  judgments. 
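
The duality on propositions and its interaction with patterns can be checked mechanically on small examples. The sketch below assumes an illustrative encoding of closed propositions of both polarities as tuples; the tag names, `dual` and `patterns` are hypothetical. It checks that dualizing twice is the identity, and that the frames of the patterns of a proposition, dualized pointwise, are the frames of the patterns of its dual.

```python
from itertools import product

# Assumed tags. Positive: "one" "zero" "tensor" "plus" "notp".
#               Negative: "bot" "top" "par" "with" "notn".
DUAL_TAG = {
    "one": "bot", "bot": "one",
    "zero": "top", "top": "zero",
    "plus": "with", "with": "plus",
    "tensor": "par", "par": "tensor",
    "notp": "notn", "notn": "notp",
}

def dual(prop):
    """Swap every connective for its dual, recursively."""
    return (DUAL_TAG[prop[0]],) + tuple(dual(q) for q in prop[1:])

def patterns(prop):
    """Frames of the proof patterns (positive) or refutation patterns
    (negative) of `prop`; the polarity is determined by the outer tag."""
    tag = prop[0]
    if tag in ("one", "bot"):
        return [()]
    if tag in ("zero", "top"):
        return []
    if tag in ("tensor", "par"):
        return [d1 + d2
                for d1, d2 in product(patterns(prop[1]), patterns(prop[2]))]
    if tag in ("plus", "with"):
        return patterns(prop[1]) + patterns(prop[2])
    if tag in ("notp", "notn"):
        return [(prop[1],)]
    raise ValueError(f"unknown connective: {tag!r}")

C = ("tensor", ("notp", ("one",)),
     ("plus", ("notp", ("zero",)), ("notp", ("one",))))

# Involution: dualizing twice gives back the original proposition.
print(dual(dual(C)) == C)

# Patterns of C, dualized hole by hole, are exactly the patterns of dual(C).
dualized = [tuple(dual(h) for h in frame) for frame in patterns(C)]
print(dualized == patterns(dual(C)))
```

Sharing one `patterns` function between the two polarities is a design choice that makes the symmetry visible: dual connectives fall into the same branch.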

Proposition  2.2.5  (Duality). 

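1. $\Delta \Vdash A$ iff $\Delta^\circ \Vdash \bullet A^\circ$

2. $\Gamma \vdash J$ iff $\Gamma \vdash J^\circ$

3. $\Gamma(\Delta) \vdash J$ iff $\Gamma(\Delta^\circ) \vdash J$

Proof. (1) is immediate from the definition of patterns. From (1), we derive (2) and (3) by
mutual induction on the derivations. For example, suppose we have a refutation of a positive
proposition:

\[
\dfrac{\Delta \Vdash A \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A}
\]

Given a refutation pattern $\Delta \Vdash \bullet A^\circ$, by (1) and the fact that $A^{\circ\circ} = A$ we obtain a proof pattern
$\Delta^\circ \Vdash A$, then a derivation of $\Gamma, \Delta^\circ \vdash \#$ by the premise, and hence $\Gamma, \Delta \vdash \#$ by (3) and the fact
that $\Delta^{\circ\circ} = \Delta$. Therefore $\Gamma \vdash A^\circ$. □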

Finally, just as we introduced complex proof hypotheses to simplify the presentation of
derivations in proof-biased logic, here we can introduce complex refutation hypotheses $\bullet A$. Complex
refutation variables are used with the following rule, which is invertible:

\[
\dfrac{\Delta \Vdash \bullet A \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(\bullet A) \vdash J}
\]


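The clauses of $(-)^\circ$ for the negative connectives are:

\[
\begin{aligned}
\bot^\circ &= 1 \\
\top^\circ &= 0 \\
(A \mathbin{\&} B)^\circ &= A^\circ \oplus B^\circ \\
(A \parr B)^\circ &= A^\circ \otimes B^\circ \\
(\overset{-}{\neg} A)^\circ &= \overset{+}{\neg} A^\circ
\end{aligned}
\]

Frames $\Delta ::= \bullet A \mid A \mid \cdot \mid (\Delta_1, \Delta_2)$
Contexts $\Gamma ::= \cdot \mid \Gamma, \Delta$

\[
\dfrac{\Delta \Vdash A^+ \qquad \Gamma \vdash \Delta}{\Gamma \vdash A^+}
\qquad
\dfrac{\Delta \Vdash A^+ \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A^+}
\]
\[
\dfrac{\Delta \Vdash \bullet A^- \;\longrightarrow\; \Gamma, \Delta \vdash \#}{\Gamma \vdash A^-}
\qquad
\dfrac{\Delta \Vdash \bullet A^- \qquad \Gamma \vdash \Delta}{\Gamma \vdash \bullet A^-}
\]
\[
\dfrac{\bullet A^+ \in \Gamma \qquad \Gamma \vdash A^+}{\Gamma \vdash \#}
\qquad
\dfrac{}{\Gamma \vdash \cdot}
\qquad
\dfrac{\Gamma \vdash \Delta_1 \qquad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\qquad
\dfrac{A^- \in \Gamma \qquad \Gamma \vdash \bullet A^-}{\Gamma \vdash \#}
\]
\[
\dfrac{\Delta \Vdash A^+ \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(A^+) \vdash J}
\qquad
\dfrac{\Delta \Vdash \bullet A^- \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(\bullet A^-) \vdash J}
\]

Figure 2.6: Canonical derivations in polarized logic, with complex hypotheses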


2.3  Propositional  polarized  logic 

2.3.1  A  unified  view 

We  have  shown  two  different  ways  of  defining  logical  connectives:  either  positively  by  their  proof 
patterns,  or  negatively  by  their  refutation  patterns.  Now  we  explain  how  these  alternatives  are 
not  incompatible,  in  the  sense  that  they  define  different  fragments  of  a  single,  polarized  logic. 
From now on I use the letters A, B, C to range over polarized propositions, which have a definite,
positive or negative polarity. Positive polarity is marked explicitly by writing $A^+$, negative
polarity by writing $A^-$, but this annotation can also be left out when the polarity is clear from
context.

Frames  can  now  contain  holes  for  both  proofs  and  refutations,  and  connectives  are  defined 
either  by  their  proof  patterns  or  by  their  refutation  patterns.  Canonical  derivations  in  polarized 
logic  are  formed  by  combining  the  inference  rules  for  positive  and  negative  logic,  as  summarized 
in  Figure  2.6.  We  include  in  the  figure  both  rules  for  using  complex  hypotheses. 

We  likewise  combine  the  identity  and  composition  principles. 

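Identity (proof). If $A \in \Gamma$ then $\Gamma \vdash A$

Identity (refutation). If $\bullet A \in \Gamma$ then $\Gamma \vdash \bullet A$

Identity (frame). If $\Delta \in \Gamma$ then $\Gamma \vdash \Delta$

Composition (reduction). If $\Gamma \vdash \bullet A$ and $\Gamma \vdash A$ then $\Gamma \vdash \#$

Composition (substitution). If $\Gamma(\Delta) \vdash J$ and $\Gamma \vdash \Delta$ then $\Gamma \vdash J$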

Again,  these  are  verified  by  an  argument  completely  analogous  to  that  of  §2.1.4,  conditioned  on 
well-foundedness  of  the  definition  ordering. 

If  you  have  been  paying  close  attention  to  these  definitions,  however,  you  will  notice  that 
this is still only a trivial combination of two logics. All of the connectives defined in §2.1.1 and
§2.2.1 preserve polarity, which means that a positive proposition has only positive ancestors,
and  a  negative  proposition  only  negative  ancestors.  Since  the  structure  of  canonical  derivations 
mirrors  the  definition  ordering,  there  is  no  real  interaction  between  the  two  fragments.  But 
what  makes  polarized  logic  non-trivial  is  that  we  can  define  additional  connectives  that  change 
polarity.  In  particular,  we  will  find  the  following  pair  of  connectives  most  interesting: 

\[
A^- \Vdash \downarrow A \qquad \bullet A^+ \Vdash \bullet \uparrow A
\]

$\downarrow$ coerces a negative proposition into a positive one, and $\uparrow$ coerces a positive proposition into a
negative one. Note that the polarity annotations on the left-hand sides of the pattern axioms
are not connectives, they are only to emphasize the polarity flip. When we don't care about the
polarity of A and of the resulting shift, we simply write $\updownarrow A$. These connectives, pronounced
"down shift" and "up shift", or just "shift", may at first seem logically vacuous, particularly to
a classical logician. But note that A is always an ancestor of $\updownarrow A$, and for this reason the shifts
actually  have  an  interesting  effect  on  canonical  derivations  and  on  the  identity  and  composition 
principles.  In  a  sense  we  can  view  the  shifts  as  modalities,  mediating  between  the  constructively 
weaker  (more  permissive)  notion  of  negative  proof  (by-contradiction)  and  the  stronger  notion  of 
positive  (direct)  proof,  and  between  the  weaker  notion  of  positive  refutation  (by-contradiction) 
and  the  stronger  notion  of  negative  (direct)  refutation. 

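Proposition 2.3.1. The shift connectives have the following derived rules of proof and refutation:

\[
\dfrac{\Gamma \vdash A^-}{\Gamma \vdash \downarrow A}
\qquad
\dfrac{\Gamma, A^- \vdash \#}{\Gamma \vdash \bullet \downarrow A}
\qquad
\dfrac{\Gamma, \bullet A^+ \vdash \#}{\Gamma \vdash \uparrow A}
\qquad
\dfrac{\Gamma \vdash \bullet A^+}{\Gamma \vdash \bullet \uparrow A}
\]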

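Because these are the only canonical rules of proof/refutation for shifted propositions, the
following instances of composition (reduction)...

1. If $\Gamma \vdash \downarrow A$ and $\Gamma \vdash \bullet \downarrow A$ then $\Gamma \vdash \#$.

2. If $\Gamma \vdash \uparrow A$ and $\Gamma \vdash \bullet \uparrow A$ then $\Gamma \vdash \#$.

...reduce immediately to the following instances of composition (substitution) on singleton
frames:

1. If $\Gamma \vdash A^-$ and $\Gamma, A^- \vdash \#$ then $\Gamma \vdash \#$.

2. If $\Gamma, \bullet A^+ \vdash \#$ and $\Gamma \vdash \bullet A^+$ then $\Gamma \vdash \#$.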

In  Chapter  4,  we  will  revisit  these  principles  from  a  computational  perspective. 

Besides  the  shift  connectives,  we  can  define  two  additional  connectives  that  mix  polarity: 

\[
\dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash \bullet B^-}{\Delta_1, \Delta_2 \Vdash \bullet A^+ \to B^-}
\qquad
\dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash \bullet B^-}{\Delta_1, \Delta_2 \Vdash A^+ - B^-}
\]

Implication $A \to B$ is defined as a negative connective: to refute an implication $A \to B$, we give
a proof of A and a refutation of B. The connective $A - B$ is the dual of implication familiar
from subtractive logic [Crolard, 2001]: its proof conditions are exactly the same as the refutation
conditions for $A \to B$. Note that the polarities of the subcomponents $A^+$ and $B^-$ ensure that
we can continue to decompose their proof/refutation patterns, and also that $A \to B$ and $A - B$
have the same ancestors, exactly the union of the ancestors of A and B.

\[
A^- \Vdash \downarrow A \qquad \bullet A^+ \Vdash \bullet \uparrow A
\]
\[
\dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash \bullet B^-}{\Delta_1, \Delta_2 \Vdash A^+ - B^-}
\qquad
\dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash \bullet B^-}{\Delta_1, \Delta_2 \Vdash \bullet A^+ \to B^-}
\]

Figure 2.7: The polarity mediating connectives


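Proposition 2.3.2. The following rules for $A \to B$ and $A - B$ are admissible (double lines indicating
bidirectionality):

\[
\begin{array}{c}
\Gamma, \bullet B^- \vdash \bullet A^+ \\
\hline\hline
\Gamma, A^+ \vdash B^- \\
\hline\hline
\Gamma \vdash A \to B
\end{array}
\qquad
\dfrac{\Gamma \vdash A \qquad \Gamma \vdash \bullet B}{\Gamma \vdash \bullet A \to B}
\]
\[
\begin{array}{c}
\Gamma \vdash A \to B \\
\hline\hline
\Gamma \vdash \bullet A - B
\end{array}
\qquad
\begin{array}{c}
\Gamma \vdash \bullet A \to B \\
\hline\hline
\Gamma \vdash A - B
\end{array}
\]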


Note that the usual involutive negation of linear logic can be defined in terms of implication and
subtraction: the dual of a positive proposition $A^+$ is defined as $A^\perp = A \to \bot$, while the dual of
a negative proposition $B^-$ is defined as $B^\perp = 1 - B$.

Proposition 2.3.3 (Involution).

1. $\Delta \Vdash 1 - (A^+ \to \bot)$ iff $\Delta \Vdash A^+$

2. $\Delta \Vdash \bullet (1 - A^-) \to \bot$ iff $\Delta \Vdash \bullet A^-$
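
The involution property can be checked on examples by computing patterns for the polarity-mixing connectives. In this sketch, which rests on an assumed encoding, frames hold tagged holes `("prf", A)` for proof holes and `("ref", A)` for refutation holes; the tags `"down"`, `"up"`, `"imp"`, `"sub"` and the function names `pos` and `neg` are illustrative only.

```python
# Assumed encoding of closed polarized propositions.
#   positive: ("one",) ("zero",) ("tensor", A, B) ("plus", A, B)
#             ("notp", A) ("down", A) ("sub", A, B)   # sub is  A - B
#   negative: ("bot",) ("top",) ("par", A, B) ("with", A, B)
#             ("notn", A) ("up", A) ("imp", A, B)     # imp is  A -> B

def join(xs, ys):
    """All concatenations of a frame from xs with a frame from ys."""
    return [a + b for a in xs for b in ys]

def pos(p):
    """Frames of the proof patterns of a positive proposition."""
    tag = p[0]
    if tag == "one":
        return [()]
    if tag == "zero":
        return []
    if tag == "tensor":
        return join(pos(p[1]), pos(p[2]))
    if tag == "plus":
        return pos(p[1]) + pos(p[2])
    if tag == "notp":
        return [(("ref", p[1]),)]
    if tag == "down":
        return [(("prf", p[1]),)]
    if tag == "sub":                      # proof of A, refutation of B
        return join(pos(p[1]), neg(p[2]))
    raise ValueError(f"not a positive connective: {tag!r}")

def neg(p):
    """Frames of the refutation patterns of a negative proposition."""
    tag = p[0]
    if tag == "bot":
        return [()]
    if tag == "top":
        return []
    if tag == "par":
        return join(neg(p[1]), neg(p[2]))
    if tag == "with":
        return neg(p[1]) + neg(p[2])
    if tag == "notn":
        return [(("prf", p[1]),)]
    if tag == "up":
        return [(("ref", p[1]),)]
    if tag == "imp":                      # proof of A, refutation of B
        return join(pos(p[1]), neg(p[2]))
    raise ValueError(f"not a negative connective: {tag!r}")

A_pos = ("plus", ("down", ("bot",)), ("notp", ("one",)))
B_neg = ("with", ("up", ("one",)), ("notn", ("bot",)))

# 1 - (A -> bot) has the same proof patterns as A
print(pos(("sub", ("one",), ("imp", A_pos, ("bot",)))) == pos(A_pos))
# (1 - B) -> bot has the same refutation patterns as B
print(neg(("imp", ("sub", ("one",), B_neg), ("bot",))) == neg(B_neg))
```

Both comparisons hold because 1 and ⊥ each contribute exactly one pattern with an empty frame, so wrapping a proposition in the double dual leaves its frames unchanged.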


2.3.2  Atomic  propositions 

We  have  assumed  so  far  that  all  propositions  are  constructed  out  of  the  polarized  connectives, 
but  we  should  also  consider  indecomposable,  atomic  propositions.  We  use  the  letters  X,  Y,  Z  to 
stand  for  atomic  propositions,  and  keeping  the  assumption  that  all  propositions  have  a  definite 
polarity, write $X^+$, $Y^-$, etc., to indicate the polarity of an atom.

To include atoms in canonical derivations, we add a pair of pattern rules:

\[
X^+ \Vdash X^+
\qquad
\bullet X^- \Vdash \bullet X^-
\]

and a pair of rules for satisfying the atomic hypotheses in a frame:

\[
\frac{X^+ \in \Gamma}{\Gamma \vdash X^+}
\qquad
\frac{\bullet X^- \in \Gamma}{\Gamma \vdash \bullet X^-}
\]

In other words, the only way to reason about atoms is axiomatically.

The identity principle (if $X^+ \in \Gamma$ then $\Gamma \vdash X^+$, and if $\bullet X^- \in \Gamma$ then $\Gamma \vdash \bullet X^-$) is therefore
trivial for atomic hypotheses, as is composition: the only way a derivation of $\Gamma(\Delta) \vdash J$
can use an atomic hypothesis, e.g., $X^+ \in \Delta$, is to show $\Gamma(\Delta) \vdash X^+$, but $\Gamma \vdash \Delta$ already implies
$\Gamma \vdash X^+$.

Frames $\Delta ::= \cdots \mid X^+ \mid \bullet X^-$

\[
X^+ \Vdash X^+
\qquad
\bullet X^- \Vdash \bullet X^-
\]

\[
\frac{X^+ \in \Gamma}{\Gamma \vdash X^+}
\qquad
\frac{\bullet X^- \in \Gamma}{\Gamma \vdash \bullet X^-}
\]


Figure  2.8:  Atomic  propositions 

Definition 2.3.4. The propositions of propositional polarized logic (PPL) are built out of polarized
atoms using any finite combination of the aforementioned positive connectives ($1$, $0$, $\otimes$, $\oplus$, $\overset{+}{\neg}$), negative
connectives ($\top$, $\bot$, $\&$, $\parr$, $\overset{-}{\neg}$), and mixed polarity connectives ($\downarrow$, $\uparrow$, $\to$, $-$).

Because of the triviality of the identity and composition principles, we do not include atoms in
the definition ordering, i.e., the definition trees of $X^+$ and $X^-$ are empty.

Proposition 2.3.5. For any proposition $A$ in PPL, and any frame $\Delta$ built out of PPL propositions, the
definition trees $\prec_A$ and $\prec_\Delta$ are well-founded.

Theorem  2.3.6.  The  identity  and  composition  principles  are  admissible  on  all  propositions  of  PPL. 

Proof.  A  corollary  of  Proposition  2.3.5  and  the  generalization  of  Theorems  2.1.11  and  2.1.12  to 
derivations  of  polarized  logic.  □ 


2.3.3  The  entailment  relation(s) 

One of the traditional views of logic is as a partial order on propositions, i.e., an entailment
relation. We have given an explanation of inference rules and canonical derivations in polarized
logic, but have not really discussed entailment for polarized propositions. An excuse for
this omission is that there are actually two different canonical ways of defining entailment.
Essentially, we can either define entailment "positively" by $A \vdash B$, or "negatively" by $\bullet B \vdash \bullet A$.
That is, we can ask whether a proof of $A$ implies a proof of $B$, or whether a refutation of $B$
implies a refutation of $A$. In general, these two notions of entailment do not coincide: when the
antecedent is positive, positive entailment is a stronger requirement than negative entailment,
whereas when the consequent is negative the opposite is the case.

Definition 2.3.7. We say that $A$ positively entails $B$ ($A \le^+ B$) if $A \vdash B$.

Definition 2.3.8. We say that $A$ negatively entails $B$ ($A \le^- B$) if $\bullet B \vdash \bullet A$.

Proposition 2.3.9. $A^+ \le^+ B$ implies $A^+ \le^- B$ (for arbitrary polarity $B$), and $A \le^- B^-$ implies
$A \le^+ B^-$ (for arbitrary polarity $A$).

Proof. We derive $A^+ \le^- B$ from $A^+ \le^+ B$ as follows:

\[
\dfrac{
  \dfrac{
    \dfrac{}{\bullet B, A^+ \vdash \bullet B}\;\text{Id}
    \qquad
    \dfrac{A^+ \le^+ B}{\bullet B, A^+ \vdash B}
  }{\bullet B, A^+ \vdash \#}\;(\dagger)
}{\bullet B \vdash \bullet A^+}
\]

where at step ($\dagger$) we are applying composition (reduction). To derive $A \le^+ B^-$ from $A \le^- B^-$
we reason dually. □

Corollary 2.3.10. $A^+ \le^+ B^-$ iff $A^+ \le^- B^-$.

Because positive entailment is always the stronger relationship between two positive propositions,
and negative entailment always the stronger relationship between negative propositions,
we say that an entailment $A \le B$ between like-polarity propositions holds strongly if $A$ and $B$
both have polarity $p$ and $A \le^{p} B$, or weakly if $A \le^{\bar{p}} B$ for the opposite polarity $\bar{p}$. Because the
two forms of entailment coincide when the antecedent is positive and the consequent negative, we
simply write $A^+ \le B^-$ without further specification. Note that when the antecedent is negative
and the consequent positive, in general neither form of entailment implies the other. For example,
we have $\top \le^+ 1$ (because $1$ has a trivial proof) and $\top \not\le^- 1$ (because $\top$ is irrefutable), but also
$\bot \le^- 0$ (because $\bot$ has a trivial refutation) and $\bot \not\le^+ 0$ (because $0$ is unprovable). For arbitrary
polarities and forms of entailment, we write $A = B$ if both $A \le B$ and $B \le A$.

Proposition 2.3.11. $\le^+$ and $\le^-$ are reflexive and transitive.

Proof. Immediate by the identity and composition principles. □

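Proposition 2.3.12. The following rules of entailment between like-polarity propositions are valid strongly
(and hence also weakly):

\[
\frac{A_1 \le B_1 \quad A_2 \le B_2}{A_1 \otimes A_2 \le B_1 \otimes B_2}
\qquad A \otimes (B \otimes C) = (A \otimes B) \otimes C
\qquad A \otimes B = B \otimes A
\qquad A = A \otimes 1
\]
\[
A \otimes B \le A \qquad A \otimes B \le B \qquad A \le A \otimes A
\]
\[
0 \le A \qquad A \le A \oplus B \qquad B \le A \oplus B
\qquad \frac{A \le C \quad B \le C}{A \oplus B \le C}
\]
\[
A \otimes (B \oplus C) \le (A \otimes B) \oplus (A \otimes C) \qquad A \otimes 0 \le B
\]
\[
\frac{A \le B \quad A \le C}{A \le B \mathbin{\&} C}
\qquad A \mathbin{\&} B \le A \qquad A \mathbin{\&} B \le B \qquad A \le \top
\]
\[
A \parr A \le A \qquad A \le A \parr B \qquad B \le A \parr B
\]
\[
A \parr \bot = A \qquad A \parr B = B \parr A \qquad (A \parr B) \parr C = A \parr (B \parr C)
\qquad \frac{A_1 \le B_1 \quad A_2 \le B_2}{A_1 \parr A_2 \le B_1 \parr B_2}
\]
\[
A \le B \parr \top \qquad (A \parr B) \mathbin{\&} (A \parr C) \le A \parr (B \mathbin{\&} C)
\]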

A  <  B  B  <  A 

A-^B  =  Al^8B  A  —  B  =  A®  Bl  \A<\B  -A.  <  B 

Proof. Routine calculation from the definition of the connectives (Figures 2.1, 2.4, 2.7). Note that
transitivity implies that the following rules for $\otimes$, $1$, $\parr$, $\bot$ are also valid:

\[
\frac{A \le B \quad A \le C}{A \le B \otimes C}
\qquad A \le 1
\qquad \bot \le A
\qquad \frac{A \le C \quad B \le C}{A \parr B \le C}
\]

□


Definition 2.3.13 (Galois connection, cf. [Gierz et al., 2003]). Let $\mathcal{X}$ and $\mathcal{Y}$ be two partially ordered
sets. A (monotone) Galois connection $f \dashv g$ is a pair of monotone functions $f : \mathcal{X} \to \mathcal{Y}$ and $g : \mathcal{Y} \to \mathcal{X}$
such that $x \le g(y)$ iff $f(x) \le y$. An antitone Galois connection between $\mathcal{X}$ and $\mathcal{Y}$ is a monotone Galois
connection between $\mathcal{X}$ and $\mathcal{Y}^{op}$.
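
As a concrete illustration of this definition (not an example from the text), the sketch below checks the defining condition by brute force on a finite sample of integers under the usual order. The functions `double` and `halve` and the helper `is_galois_connection` are assumptions chosen for the illustration; the check covers only the "iff" condition on the sample, not monotonicity in general.

```python
def is_galois_connection(f, g, xs, ys, leq_x, leq_y):
    """Check x <= g(y) iff f(x) <= y for every sampled x and y."""
    return all(leq_x(x, g(y)) == leq_y(f(x), y) for x in xs for y in ys)


def leq(a, b):
    """The usual order on integers."""
    return a <= b


def double(x):
    """Candidate left adjoint f."""
    return 2 * x


def halve(y):
    """Candidate right adjoint g (floor division)."""
    return y // 2


sample = range(-10, 11)
# Doubling is left adjoint to floor-halving on the integers.
print(is_galois_connection(double, halve, sample, sample, leq, leq))
# Doubling is not adjoint to itself: x = 1, y = 1 is a counterexample.
print(is_galois_connection(double, double, sample, sample, leq, leq))
```

An antitone connection can be checked with the same helper by passing the reversed order for `leq_y`.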


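Proposition 2.3.14. The following (monotone/antitone) Galois connections are valid (indicating the
polarities of $A$, $B$, and $C$ as needed for clarity):

\[
\begin{array}{c}
B^+ \le^+ \overset{+}{\neg} A \\ \hline\hline
A^+ \le^+ \overset{+}{\neg} B
\end{array}
\qquad
\begin{array}{c}
\overset{-}{\neg} B \le^- A^- \\ \hline\hline
\overset{-}{\neg} A \le^- B^-
\end{array}
\qquad
\begin{array}{c}
{\uparrow} A \le^- B \\ \hline\hline
A^+ \le B^- \\ \hline\hline
A^+ \le^+ {\downarrow} B
\end{array}
\]

\[
\begin{array}{c}
A \otimes B \le C^- \\ \hline\hline
A^+ \le B \to C
\end{array}
\qquad
\begin{array}{c}
A^+ \le B^- \parr C^- \\ \hline\hline
A^+ - B^- \le C^-
\end{array}
\]

\[
\begin{array}{c}
B \le A^\perp \\ \hline\hline
A \le B^\perp
\end{array}
\qquad
\begin{array}{c}
B^\perp \le A \\ \hline\hline
A^\perp \le B
\end{array}
\]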

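We illustrate the Galois connection between $\uparrow$ and $\downarrow$:

\[
\begin{array}{cl}
           & {\uparrow} A^+ \le^- B^- \\
\text{iff} & \bullet B^- \vdash \bullet {\uparrow} A^+ \\
\text{iff} & \bullet B^- \vdash \bullet A^+ \\
\text{iff} & \bullet B^-, A^+ \vdash \# \\
\text{iff} & A^+ \vdash B^- \\
\text{iff} & A^+ \vdash {\downarrow} B^- \\
\text{iff} & A^+ \le^+ {\downarrow} B^-
\end{array}
\]

□

Corollary 2.3.15. $A^{\perp\perp} = A$.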

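Proposition 2.3.16. The following entailments are only weakly valid:

\[
1 \le A \oplus \overset{+}{\neg} A
\qquad \overset{+}{\neg}\overset{+}{\neg} A \le A
\qquad {\downarrow}{\uparrow} A \le A
\qquad A \le {\uparrow}{\downarrow} A
\qquad A \le \overset{-}{\neg}\overset{-}{\neg} A
\qquad A \mathbin{\&} \overset{-}{\neg} A \le \bot
\]

Proof. First, we show the three entailments on the left are weakly (i.e., negatively) valid (the
argument for the three on the right is dual).

1. $1 \le^- A \oplus \overset{+}{\neg} A$, as in §2.1.5.

2. The following derivation shows $\overset{+}{\neg}\overset{+}{\neg} A \le^- A$:

\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{}{\bullet A, \bullet \overset{+}{\neg} A \vdash \bullet A}\;\text{Id}
    }{\bullet A, \bullet \overset{+}{\neg} A \vdash \overset{+}{\neg} A}
  }{\bullet A, \bullet \overset{+}{\neg} A \vdash \#}
}{\bullet A \vdash \bullet \overset{+}{\neg}\overset{+}{\neg} A}
\]

applying the derived rule for proving $\overset{+}{\neg} A$:

\[
\frac{\Gamma \vdash \bullet A}{\Gamma \vdash \overset{+}{\neg} A}
\]

3. The following derivation shows ${\downarrow}{\uparrow} A \le^- A$:

\[
\dfrac{
  \dfrac{
    \dfrac{}{\bullet A, {\uparrow} A \vdash \bullet A}\;\text{Id}
  }{\bullet A, {\uparrow} A \vdash \#}
}{\bullet A \vdash \bullet {\downarrow}{\uparrow} A}
\]

applying the derived rules for refuting $\downarrow$ and $\uparrow$.

Next, we observe that these entailments are not strongly (i.e., positively) valid.

1. $1 \le^+ X \oplus \overset{+}{\neg} X$ iff either $\cdot \vdash X$ or $\cdot \vdash \bullet X$, but both fail.

2. $\overset{+}{\neg}\overset{+}{\neg} X \le^+ X$ iff $\bullet \overset{+}{\neg} X \vdash X$ iff $X \in (\bullet \overset{+}{\neg} X)$, which is false.

3. ${\downarrow}{\uparrow} X \le^+ X$ iff ${\uparrow} X \vdash X$ iff $X \in ({\uparrow} X)$, which is false.

□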

These results about $\le^+$ and $\le^-$ may be suggestive. In terms of strong entailment, the positive
fragment  encodes  propositional  logic  with  minimal  negation  [Johansson,  1937],  which  is  like 
intuitionistic  negation  except  that  A  1  <+  0  is  not  valid.  Negative  strong  entailment  encodes 
the  dual  logic,  where  for  example  double-negation  elimination  is  valid  but  double-negation 
introduction is not.³ And in terms of weak entailment, both positive and negative entailment
collapse  to  classical  logic.  We  will  prove  all  of  these  facts  rigorously  in  Chapter  3. 

2.4  Linear  and  affine  canonical  derivations 

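Although we have used linear logic notation for the polarized connectives, they are non-linear in
the sense that they satisfy some non-linear entailments. In particular, the following entailments
from Proposition 2.3.12 may seem incongruous with the notation:

\[
A \otimes B \le A \qquad A \otimes B \le B \qquad A \le A \otimes A
\]
\[
A \parr A \le A \qquad A \le A \parr B \qquad B \le A \parr B
\]

These entailments witness the structural properties of weakening ("hypotheses can be thrown
away") and contraction ("hypotheses can be reused"). As it turns out, if we look back at the
definition of canonical derivations, we see that these structural properties are fairly isolated.
Hypothesis reuse can only occur in the rule for asserting a concatenation of frames, and in the
rules of contradiction:

\[
\frac{\bullet A^+ \in \Gamma \quad \Gamma \vdash A^+}{\Gamma \vdash \#}
\qquad
\frac{\Gamma \vdash \Delta_1 \quad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\qquad
\frac{A^- \in \Gamma \quad \Gamma \vdash \bullet A^-}{\Gamma \vdash \#}
\]

³ This has been called co-minimal logic by Vakarelov [2005].

And hypotheses may only be thrown away in the rules for satisfying atomic propositions or the
empty frame:

\[
\frac{\bullet X^- \in \Gamma}{\Gamma \vdash \bullet X^-}
\qquad
\frac{}{\Gamma \vdash \cdot}
\qquad
\frac{X^+ \in \Gamma}{\Gamma \vdash X^+}
\]

To obtain linear canonical derivations, we simply replace the above rules with the following:

\[
\frac{\Gamma \vdash A^+}{\Gamma(\bullet A^+) \vdash \#}
\qquad
\frac{\Gamma_1 \vdash \Delta_1 \quad \Gamma_2 \vdash \Delta_2}{\Gamma_1, \Gamma_2 \vdash (\Delta_1, \Delta_2)}
\qquad
\frac{\Gamma \vdash \bullet A^-}{\Gamma(A^-) \vdash \#}
\]

\[
\frac{}{\bullet X^- \vdash \bullet X^-}
\qquad
\frac{}{\cdot \vdash \cdot}
\qquad
\frac{}{X^+ \vdash X^+}
\]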

The other rules of proof and refutation are unchanged. We can similarly obtain affine
canonical derivations by only replacing the first three rules. When we want to distinguish
ordinary canonical derivations from linear/affine canonical derivations, we call the former
unrestricted canonical derivations. For reference, we include the complete definition of all three kinds
of canonical derivations in Figure 2.9. The modified notions of identity and composition for
linear canonical derivations are:

Identity (proof). $A \vdash A$

Identity (refutation). $\bullet A \vdash \bullet A$

Identity (frame). $\Delta \vdash \Delta$

Composition (reduction). If $\Gamma_1 \vdash \bullet A$ and $\Gamma_2 \vdash A$ then $\Gamma_1, \Gamma_2 \vdash \#$

Composition (substitution). If $\Gamma_1(\Delta) \vdash J$ and $\Gamma_2 \vdash \Delta$ then $\Gamma_1(\Gamma_2) \vdash J$

For affine canonical derivations, only the notions of composition are modified. Again, the
derivations witnessing these principles are completely analogous to those in §2.1.4, and are
well-founded just when the definition ordering is well-founded. Note that the notion of pattern
remains unchanged from before, and so the definition ordering remains unchanged as well.

We can define strong and weak entailment using linear/affine canonical derivations just as
we did with unrestricted canonical derivations in §2.3.3. The properties of these entailment
relations are only slightly different:

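Observation 2.4.1. All the rules of strong entailment from Proposition 2.3.12 hold for linear canonical
derivations, except for the following:

\[
A \parr A \le A \qquad A \le A \parr B \qquad B \le A \parr B
\]
\[
A \otimes B \le A \qquad A \otimes B \le B \qquad A \le A \otimes A
\]

Observation 2.4.2. All the rules of strong entailment from Proposition 2.3.12 hold for affine canonical
derivations, except for the following:

\[
A \parr A \le A \qquad A \le A \otimes A
\]

Observation 2.4.3. All of the Galois connections of Proposition 2.3.14 hold for both linear and affine
canonical derivations.

Frames $\Delta ::= \bullet A^+ \mid A^- \mid X^+ \mid \bullet X^- \mid \cdot \mid (\Delta_1, \Delta_2)$

Contexts $\Gamma ::= \cdot \mid \Gamma, \Delta$

Proof and refutation

\[
\frac{\Delta \Vdash A^+ \quad \Gamma \vdash \Delta}{\Gamma \vdash A^+}
\qquad
\frac{\Delta \Vdash A^+ \longrightarrow \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A^+}
\]
\[
\frac{\Delta \Vdash \bullet A^- \longrightarrow \Gamma, \Delta \vdash \#}{\Gamma \vdash A^-}
\qquad
\frac{\Delta \Vdash \bullet A^- \quad \Gamma \vdash \Delta}{\Gamma \vdash \bullet A^-}
\]

Complex hypotheses

\[
\frac{\Delta \Vdash A^+ \longrightarrow \Gamma(\Delta) \vdash J}{\Gamma(A^+) \vdash J}
\qquad
\frac{\Delta \Vdash \bullet A^- \longrightarrow \Gamma(\Delta) \vdash J}{\Gamma(\bullet A^-) \vdash J}
\]

With weakening

\[
\frac{\bullet X^- \in \Gamma}{\Gamma \vdash \bullet X^-}
\qquad
\frac{}{\Gamma \vdash \cdot}
\qquad
\frac{X^+ \in \Gamma}{\Gamma \vdash X^+}
\]

Weakening-free

\[
\frac{}{\bullet X^- \vdash \bullet X^-}
\qquad
\frac{}{\cdot \vdash \cdot}
\qquad
\frac{}{X^+ \vdash X^+}
\]

With contraction

\[
\frac{\bullet A^+ \in \Gamma \quad \Gamma \vdash A^+}{\Gamma \vdash \#}
\qquad
\frac{\Gamma \vdash \Delta_1 \quad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)}
\qquad
\frac{A^- \in \Gamma \quad \Gamma \vdash \bullet A^-}{\Gamma \vdash \#}
\]

Contraction-free

\[
\frac{\Gamma \vdash A^+}{\Gamma(\bullet A^+) \vdash \#}
\qquad
\frac{\Gamma_1 \vdash \Delta_1 \quad \Gamma_2 \vdash \Delta_2}{\Gamma_1, \Gamma_2 \vdash (\Delta_1, \Delta_2)}
\qquad
\frac{\Gamma \vdash \bullet A^-}{\Gamma(A^-) \vdash \#}
\]
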

Figure 2.9: Definition of linear, affine, and unrestricted canonical derivations

Observation 2.4.4. The weak entailments $1 \le A \oplus \overset{+}{\neg} A$ and $A \mathbin{\&} \overset{-}{\neg} A \le \bot$ fail for both linear and affine
canonical derivations, while the rest of the weak entailments in Proposition 2.3.16 hold.

Proof. By inspection of the unrestricted derivations in §2.3.3, observing that none make essential
use of weakening, and only these two make essential use of contraction. For example, in the
unrestricted derivation of $1 \le^- A \oplus \overset{+}{\neg} A$, which reduces to showing $\bullet A \oplus \overset{+}{\neg} A \vdash \#$, we must use
the hypothesis $\bullet A \oplus \overset{+}{\neg} A$ twice. □


2.5  Related  Work 

The  literature  on  polarity,  proof  theory,  and  constructive  accounts  of  classical  logic  is  huge.  I 
touch  only  upon  some  of  the  more  closely  related  work. 

Polarity  in  classical  linear  logic.  There  is  a  very  old,  related  notion  of  polarity  in  logic  and 
proof  theory,  in  the  sense  of  positive  and  negative  occurrences  of  formulas  (see,  e.g.,  Herbrand 
[1930], Kleene [1967], Schütte [1977]). Polarity as a property of connectives (the notion we use
here) was introduced by Girard [1991a, 1993] as a way of recovering constructive content from
classical  logic,  and  as  an  attempt  at  understanding  some  general  properties  of  logic — classical, 
intuitionistic,  linear,  etc. — in  a  unified  way.  The  formal  treatment  of  polarity  given  here  (two 
syntactically  segregated  classes  of  connectives,  with  "shift  operators"  acting  as  intermediaries) 
is  a  simplification  of  the  original  approach,  first  described  in  a  note  by  Girard  [1991b]  and  taken 
up  in  his  more  recent  work  on  "ludics"  [2001].  This  approach  has  also  been  given  extensive 
treatment in Olivier Laurent's dissertation [2002].

Polarity  outside  of  classical  linear  logic.  Similar  formal  devices  have  appeared  elsewhere. 
For  example,  as  mentioned  in  the  Introduction,  Levy's  call-by-push-value  language  maintains 
a  syntactic  separation  between  value  types  and  computation  types,  with  coercions  between  them. 
Likewise, the Concurrent Logical Framework [Watkins et al., 2002] maintains a separation
between synchronous types and asynchronous types. Unlike our presentation and those of Girard and
Laurent, in both these settings there is an asymmetry between the two polarities, which makes
them somewhat more subtle. They are nonetheless polarities, in the sense that they describe the
bias of individual connectives towards introduction or elimination; the difference is just that
the overall framework of CBPV/CLF has an asymmetry between introduction and elimination.
(We  will  discuss  the  connection  with  CBPV  in  more  detail  in  Chapter  4.) 

Proof-theoretic semantics. At the start of the chapter and in the Introduction, I gave a
paraphrase of Michael Dummett's attempt at finding a "justification" of the logical laws, through
an explanation of the meaning of the logical connectives. The idea that the introduction rules
for a connective somehow determine its meaning goes back to an offhand remark by Gentzen,
who wrote that "an introduction rule gives, so to say, a definition of the constant in question"
[1935, p. 80]. Without a direct connection to structural proof theory, this idea was already
explored by various people in the 1930s, particularly Wittgenstein ("It is what is regarded as the
justification of an assertion that constitutes the sense of the assertion" [Wittgenstein, 1974, I, §40]),
and Brouwer-Heyting-Kolmogorov in their interpretations of intuitionistic logic [Heyting, 1974,
Kolmogorov, 1932]. Gentzen's remark, however, was first explored rigorously by Prawitz [1974],
by treating the meaning of a proposition as given by its canonical proofs, or verifications. For
Prawitz, canonical proof meant proof ending in an introduction rule, and the justification of an
elimination rule consisted of a local reduction step. For example, from the introduction rule for
conjunction,

\[
\frac{A \quad B}{A \wedge B}
\]

one can justify the two elimination rules,

\[
\frac{A \wedge B}{A}
\qquad
\frac{A \wedge B}{B}
\]

by the following reductions:

\[
\dfrac{\dfrac{A \quad B}{A \wedge B}}{A} \;\Rightarrow\; A
\qquad
\dfrac{\dfrac{A \quad B}{A \wedge B}}{B} \;\Rightarrow\; B
\]

These  reductions  show  how  to  derive  a  proof  of  the  conclusion  of  each  of  the  elimination  rules, 
given  a  canonical  proof  of  their  premise. 
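
Read as programs, these two reductions are the familiar projection rules for pairs. The sketch below is an illustration only: the tuple encoding of derivations and the names `pair`, `fst`, `snd`, and `reduce_step` are assumptions made for the example, not notation used by Prawitz or elsewhere in this chapter.

```python
def pair(a, b):
    """Conjunction introduction: combine a proof of A and a proof of B."""
    return ("pair", a, b)


def fst(p):
    """First conjunction elimination, kept as an unevaluated node."""
    return ("fst", p)


def snd(p):
    """Second conjunction elimination, kept as an unevaluated node."""
    return ("snd", p)


def reduce_step(term):
    """Apply one local reduction at the root of the derivation.

    If an elimination is applied directly to an introduction, return the
    sub-derivation it selects; otherwise return the term unchanged.
    """
    if term[0] in ("fst", "snd") and term[1][0] == "pair":
        _, a, b = term[1]
        return a if term[0] == "fst" else b
    return term


proof_a = ("hyp", "a")  # stands for some canonical proof of A
proof_b = ("hyp", "b")  # stands for some canonical proof of B
print(reduce_step(fst(pair(proof_a, proof_b))))
print(reduce_step(snd(pair(proof_a, proof_b))))
```

An elimination applied to anything other than a pair is left alone, which mirrors the fact that the reduction is only defined when the premise is given by a canonical proof.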

Dummett [1991] built upon Prawitz's intuition in several ways. First, he realized that by
defining a more restrictive notion of canonical proof (ending in a series of introduction rules,
rather than just a single one) he could then have a more expansive justification procedure
(applicable to arbitrary inferences, rather than only to the standard elimination rules). Second,
he made the leap of considering that the connectives could alternatively be defined by their
elimination rules, which would then justify their introduction rules.⁴ For these combined
reasons, Dummett's analysis seems to me to have had great foresight in prefiguring the concept of
polarity. In some ways it is actually more general than the analysis given here, since Dummett
based the pragmatist meaning-theory on a notion of canonical consequence, rather than canonical
refutation. As I explained at the start of the chapter, I have chosen to present here a symmetric
view of proof patterns and refutation patterns as a matter of expediency, since it simplifies the
framework while still conveying the basic insight of polarity.

Dummett realized that certain connectives could be given certain interpretations only with
difficulty. For example, to give implication and universal quantification a verificationist
interpretation, he had to significantly weaken the notion of canonical proof (pp. 272-277). However, he
did not take the step of suggesting that the two interpretations could define different connectives.
He still required "harmony between the two aspects of linguistic practice" (p. 287), rather than
"diversity".

⁴ Dummett in fact attributes this idea to Martin-Löf, who he says "constructed an entire meaning-theory for the
language of mathematics on the basis of the assumption that it is the elimination rules that determine meaning."
This is likely a reference to Martin-Löf's work with Peter Hancock [Hancock and Martin-Löf, 1975], about which
Martin-Löf wrote to Dummett shortly before the William James Lectures [Martin-Löf, 1976]. For example, Martin-Löf
wrote:

To explain the meaning of an implication $A \supset B$, we must explain what is the purpose (function, role) of
a canonical proof of $A \supset B$. And, specializing the explanation given above [for the dependent function
space], this purpose is to be applied to a canonical proof of the proposition denoted by $A$, thereby
yielding a canonical proof of the proposition denoted by $B$. In no way is it correct to say that the
meaning of $A \supset B$ is determined by the introduction rule.

On the other hand, Hancock/Martin-Löf never discuss the role of canonical consequence in justifying the introduction
rules. In the 1983 Siena Lectures, Martin-Löf seems to explicitly adopt a verificationist stance ("The meaning of a
proposition is determined by... what counts as a verification of it", Lecture 3), and he explains the meaning of
implication in terms of its introduction rule. However, there is no real contradiction between the two positions,
because the explanation of implication given in the Siena lectures reduces its meaning to that of the hypothetical
judgment, which is explained in terms of elimination, i.e., in terms of substitution for hypotheses.

Schroeder-Heister has used the phrase proof-theoretic semantics for these attempts at finding
meaning for the logical connectives entirely within the logical rules [Kahle and Schroeder-Heister,
2006]. Girard's recent work [1998, 2001, 2006] can also be seen in these terms, as trying to give
an "internal" semantics of proofs. His attempt, however, draws on many additional concepts
from games semantics.

Game-theoretic semantics. Lorenzen [1960, 1961] and Henkin [1961] first explored the idea
of treating the truth or falsehood of a proposition as the result of a game between Proponent
and Opponent. Henkin's work was later built upon by Hintikka [1973], while Lorenzen's was
revisited in the light of linear logic by Blass [1992]. Games semantics for linear logic has been a
very active topic of research since Blass's original paper. Although there has always been some
tension between the desires of games semantics and the demands of linear logic (see the paper
by Melliès and Tabareau [2007, 2008] for a discussion), the basic framework is very compelling.
Polarity has a simple interpretation: if a proposition describes a game between Player and
Opponent, then polarity says who gets to make the first move [Laurent, 2004b]. Likewise,
negation has a very elegant definition: it simply swaps the roles of Player and Opponent. But
this means that negation is always an involution, which is at odds with the goal of using
games semantics to model mainstream functional programming languages, and negations as
continuations. Melliès and Tabareau therefore propose a new direction for games semantics,
with non-involutive negation playing the central role.

With canonical derivations, we have seen that there is space for many different kinds of
negations. At the most basic level, there is negation at the level of judgments, $\bullet A$. The question
of whether it is involutive does not really make sense, because it cannot be iterated. However,
as we saw in terms of the differences between strong and weak entailment, for propositions of a
given polarity, there is a fundamental asymmetry between assertion $A$ and denial $\bullet A$. But then
there is also negation at the level of logical connectives, and we found that there are many:
non-involutive negations $\neg A$ of both polarities, and the polarity-reversing negations $A^\perp = A \to \bot$
and $B^\perp = 1 - B$, which together form an involution.

Assertion  and  denial.  We  gave  refutation  a  first-class  status,  distinct  from  the  proof  of  a 
negation.  Such  analyses  have  been  used  before  in  trying  to  understand  the  proof  theory  of 
classical  logic,  as  well  as  to  make  sense  of  different  paraconsistent  logics.  Smiley's  article  [1996] 
represents  one  such  analysis,  as  does  Stewart's  analysis  of  classical  natural  deduction  [1999],  and 
Restall's of multiple conclusion sequent calculus [2005]. (Our notation $A$ and $\bullet A$ for assertion
and denial is borrowed from Stewart.) Bellin and Biasi [2004] also draw a similar analogy to the
one  we  made,  connecting  the  duality  between  assertion  and  denial  (or  "conjecture",  as  they  put 
it)  and  the  duality  between  positive  and  negative  polarity. 

Display  Logic.  In  addition  to  assertion  and  denial,  our  description  of  canonical  derivations 
relied  crucially  on  the  notion  of  frame.  In  a  sense,  frames  can  be  seen  as  turning  the  comma 
into  a  connective  on  judgments,  appearing  both  to  the  left  and  to  the  right  of  the  turnstile.  (Note 
this  is  different  from  the  comma  in  Gentzen's  multiple  conclusion  sequent  calculus,  which  means 
different  things  on  the  left  and  on  the  right.)  We  made  it  almost  a  first-class  connective  by  allowing 
complex  frames,  though  not  completely  first-class  because  we  still  forbade  the  contradiction 
judgment  #  in  frames.  It  seems  there  is  an  analogy  to  be  drawn  with  Belnap's  display  logic 
[Belnap,  1982],  which  is  also  formulated  in  terms  of  first-class  "structural  connectives".  As 
Restall  [1995,  1998]  has  observed,  these  structural  connectives  can  be  assigned  polarities.  Galois 
connections of the sort we described in §2.3.3 also play an important role, through the link
to Dunn's "gaggle theory" [Dunn, 1991]. And there is a generic proof of cut admissibility for
display logic, given properties of the structural rules [Belnap, 1982, Dawson and Goré, 2002].
These similarities may hint at a deeper connection.

Infinitary proof theory and iterated inductive definitions. As mentioned in Footnote 1, our
rule of refutation for positive propositions, as well as our rule of proof for negative propositions,
bears a striking formal resemblance to the Buchholz $\Omega$-rule for iterated inductive definitions
[Buchholz et al., 1981]. Buchholz's rule was a powerful generalization of the so-called $\omega$-rule
for first-order arithmetic ("derive $\forall n.A(n)$ given proofs of $A(0), A(1), A(2), \ldots$"), suggested by
Hilbert [1931] and studied by Novikov [1943], Schütte [1950], and Lorenzen [1951]. We have
stayed clear of infinity in this chapter, but will embrace it wholeheartedly in Chapter 4, with
recursive types that have both infinitely many patterns, and infinitely descending definition
trees. The elegance of the approach based on infinitary proof theory is that the procedure for
composition/cut-elimination is unaffected by whether or not types are infinite, only the argument
about whether or not it terminates.

Chapter 3

Focusing proofs and double-negation translations


During a lecture the Oxford linguistic philosopher J. L. Austin made the claim that
although a double negative in English implies a positive meaning, there is no language
in which a double positive implies a negative. To which Morgenbesser responded in
a dismissive tone, "Yeah, yeah."


— Wikipedia  entry  for  Sidney  Morgenbesser 


As  already  acknowledged,  the  preceding  chapter  was  a  bit  of  revisionist  history.  I  described 
how  a  certain  notion  of  proof  and  refutation  arises  naturally  by  considering  connectives  to  be 
defined  by  patterns:  either  proof  patterns  (positive  connectives),  or  refutation  patterns  (negative 
connectives). I called these canonical derivations, and gave a presentation roughly in the style
of Martin-Löf (in the sense of distinguishing multiple forms of judgments) and fundamentally
an iterated inductive definition, including rules structurally very similar to Buchholz's $\Omega$-rule.

In  this  chapter,  I  will  explain  how  canonical  derivations  can  also  be  seen  as  an  alternative 
presentation  of  Jean-Marc  Andreoli's  focusing  strategies  for  sequent  calculus.  This,  of  course,  is  not 
simply  a  remarkable  coincidence,  because  the  system  I  presented  was  derived  backwards,  with 
polarity  and  focusing  already  in  mind.  My  reason  for  delaying  the  discussion  of  focusing  is  in 
part  technical:  the  sequent  calculus  notation  makes  a  number  of  unnecessary  distinctions,  which 
leads  to  more  cumbersome  inference  rules,  identity  and  composition  principles.  But  there  is  also 
a  conceptual  reason.  At  least  at  first  glance,  focusing  seems  to  be  an  extra  layer  of  complexity 
grafted  onto  the  sequent  calculus  to  obtain  a  better  proof  search  procedure.  Sequent  calculus 
comes  first,  and  focusing  is  secondary.  But  this  conceptual  order  is  backwards,  I  would  argue: 
focusing  proofs  are  simpler  and  more  basic  than  ordinary  sequent  calculus  proofs,  precisely 
because  they  have  a  natural  interpretation  as  canonical  forms  of  proof  and  refutation.  At  the 
end  of  the  chapter,  we  will  see  how  this  correspondence  gives  us  a  way  of  understanding  focusing 
as  a  sort  of  double-negation  interpretation,  and  indeed  a  recipe  for  reconstructing  many  different 
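double-negation translations of classical logic into minimal logic.

\[
\frac{}{A \vdash A}\;\mathit{init}
\qquad
\frac{\mathfrak{L}_1 \vdash \mathfrak{R}_1, A \quad A, \mathfrak{L}_2 \vdash \mathfrak{R}_2}{\mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2}\;\mathit{cut}
\]
\[
\frac{\mathfrak{L} \vdash \mathfrak{R}}{1, \mathfrak{L} \vdash \mathfrak{R}}\;1L
\qquad
\frac{A, B, \mathfrak{L} \vdash \mathfrak{R}}{A \otimes B, \mathfrak{L} \vdash \mathfrak{R}}\;{\otimes}L
\qquad
\frac{\mathfrak{L}_1 \vdash \mathfrak{R}_1, A \quad \mathfrak{L}_2 \vdash \mathfrak{R}_2, B}{\mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2, A \otimes B}\;{\otimes}R
\qquad
\frac{}{\cdot \vdash 1}\;1R
\]
\[
\frac{A, \mathfrak{L} \vdash \mathfrak{R}}{A \mathbin{\&} B, \mathfrak{L} \vdash \mathfrak{R}}\;{\&}L_1
\qquad
\frac{B, \mathfrak{L} \vdash \mathfrak{R}}{A \mathbin{\&} B, \mathfrak{L} \vdash \mathfrak{R}}\;{\&}L_2
\qquad
\frac{\mathfrak{L} \vdash \mathfrak{R}, A \quad \mathfrak{L} \vdash \mathfrak{R}, B}{\mathfrak{L} \vdash \mathfrak{R}, A \mathbin{\&} B}\;{\&}R
\qquad
\frac{}{\mathfrak{L} \vdash \mathfrak{R}, \top}\;{\top}R
\]
\[
\frac{}{0, \mathfrak{L} \vdash \mathfrak{R}}\;0L
\qquad
\frac{A, \mathfrak{L} \vdash \mathfrak{R} \quad B, \mathfrak{L} \vdash \mathfrak{R}}{A \oplus B, \mathfrak{L} \vdash \mathfrak{R}}\;{\oplus}L
\qquad
\frac{\mathfrak{L} \vdash \mathfrak{R}, A}{\mathfrak{L} \vdash \mathfrak{R}, A \oplus B}\;{\oplus}R_1
\qquad
\frac{\mathfrak{L} \vdash \mathfrak{R}, B}{\mathfrak{L} \vdash \mathfrak{R}, A \oplus B}\;{\oplus}R_2
\]
\[
\frac{}{\bot \vdash \cdot}\;{\bot}L
\qquad
\frac{A, \mathfrak{L}_1 \vdash \mathfrak{R}_1 \quad B, \mathfrak{L}_2 \vdash \mathfrak{R}_2}{A \parr B, \mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2}\;{\parr}L
\qquad
\frac{\mathfrak{L} \vdash \mathfrak{R}, A, B}{\mathfrak{L} \vdash \mathfrak{R}, A \parr B}\;{\parr}R
\qquad
\frac{\mathfrak{L} \vdash \mathfrak{R}}{\mathfrak{L} \vdash \mathfrak{R}, \bot}\;{\bot}R
\]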


Figure  3.1:  Sequent  calculus  for  multiplicative-additive  linear  logic 


3.1  Focusing  proof  search  for  linear  logic 

Focusing  was  originally  discovered  in  the  context  of  linear  logic  [Girard,  1987,  Andreoli,  1992], 
and  is  most  vividly  illustrated  there.  A  standard,  two-sided  presentation  of  multiplicative- 
additive  linear  logic  (MALL)  is  given  in  Figure  3.1.  Sequents  are  treated  modulo  reordering 
of  formulas,  so  that  the  structural  property  of  exchange  is  implicit.  The  structural  properties  of 
weakening  and  contraction  are  explicitly  omitted.  We  write  £  b^  91  to  indicate  that  the  sequent 
£  b  91  is  derivable  from  these  inference  rules.  To  begin  we  will  give  a  simple-minded  proof 
search  algorithm  as  a  decision  procedure  for  hg,  and  then  see  how  to  refine  this  procedure 
through  focusing. 

3.1.1  Naive  proof  search  for  linear  logic 

The  naive  procedure  relies  only  on  a  few  facts  about  MALL,  which  we  state  without  proof. 

Definition 3.1.1. We say that $B$ is an immediate syntactic subformula of $A$ (written $B < A$) if
$A = \odot(B_1, \ldots, B_n)$ for some $n$-ary connective $\odot$, and $B = B_i$ for some $i$.

We extend the syntactic subformula ordering to a multiset ordering on sequents: we say that
the sequent $\mathfrak{L}' \vdash \mathfrak{R}'$ is strictly smaller than $\mathfrak{L} \vdash \mathfrak{R}$ (written $\mathfrak{L}' \vdash \mathfrak{R}' < \mathfrak{L} \vdash \mathfrak{R}$), if the formulas of
$\mathfrak{L}' \uplus \mathfrak{R}'$ are obtained by removing some formula $A \in \mathfrak{L} \uplus \mathfrak{R}$ and replacing it by a list $A_1, \ldots, A_n$
of immediate syntactic subformulas.

Proposition  3.1.2.  The  ordering  <  on  sequents  is  well-founded. 

Definition 3.1.3. The L-rules and R-rules are called logical rules. The unique formula introduced
on the left or right in the conclusion of a logical rule is called its principal formula. The syntactic
subformulas of the principal formula appearing in the premises of a logical rule are called active formulas.
The remaining formulas carried through in $\mathfrak{L}$ and $\mathfrak{R}$ are called the context. We sometimes write the
context as a sequent, i.e., the context of the formula $A$ in the sequent $\mathfrak{L} \vdash \mathfrak{R}, A$ (or $A, \mathfrak{L} \vdash \mathfrak{R}$) is written
as $\mathfrak{L} \vdash \mathfrak{R}$.
Observation 3.1.4. In every logical rule, the premises are strictly smaller than the conclusion.

Theorem 3.1.5 (Init-elimination). Any derivation that uses (init) can be converted to one where (init)
is restricted to atomic formulas.

Theorem 3.1.6 (Cut-elimination). Any derivation that uses (cut) can be converted to one that doesn't.

Corollary 3.1.7. If $\mathfrak{L} \vdash_{\ell} \mathfrak{R}$, then there is a derivation of $\mathfrak{L} \vdash \mathfrak{R}$ using only logical rules and the (init)
rule restricted to atomic formulas.

Now, consider the following simple decision procedure for MALL sequents (and as a special
case, MALL formulas), which attempts to build a proof "backwards", i.e., starting from the goal
$\mathfrak{L} \vdash \mathfrak{R}$ as the root, and trying to build up a proof tree:

1. Find a logical rule whose conclusion matches the goal sequent $\mathfrak{L} \vdash \mathfrak{R}$, and recursively try
to prove each premise as a goal. If the rule has no premises ($1R$, ${\top}R$, $0L$, ${\bot}L$) then the
sequent is provable and we are done. Note that some rules (${\otimes}R$, ${\parr}L$) can be applied in
multiple ways, by choosing different splittings of the context.

2. Suppose the sequent does not fit the conclusion of any logical rule: if it is an atomic initial
sequent $X \vdash X$ then we can apply (init) to complete the proof; otherwise, the sequent is
unprovable, and we must backtrack to one of our earlier goals in (1), and try to prove it
by different means (i.e., with a different rule, or a different splitting of the context).

Corollary 3.1.7 implies that if $\mathfrak{L} \vdash_{\ell} \mathfrak{R}$, then this procedure will always find a proof given an
oracle for step (1). The combination of Observation 3.1.4 and Proposition 3.1.2, together with
the fact that there are only finitely many logical rules and finitely many splittings of a context,
implies that the oracle is superfluous, and the procedure will always terminate with either a
derivation of $\mathfrak{L} \vdash \mathfrak{R}$ or the knowledge that it is unprovable.
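
The two-step procedure can be sketched in a few lines of Python. This is only an illustrative sketch, not an implementation taken from the text: the tuple encoding of formulas and the names `provable` and `splits` are assumptions made here. Each failed attempt simply falls through to the next candidate, which plays the role of backtracking; termination follows because every recursive call is on a strictly smaller sequent.

```python
from itertools import product

# Hypothetical encoding: formulas are nested tuples tagged by connective.
def atom(x): return ("atom", x)
ONE, ZERO, TOP, BOT = ("one",), ("zero",), ("top",), ("bot",)
def tensor(a, b): return ("tensor", a, b)
def par(a, b): return ("par", a, b)
def with_(a, b): return ("with", a, b)
def plus(a, b): return ("plus", a, b)

def splits(ctx):
    """Yield every way of dividing a context between two premises."""
    for mask in product((False, True), repeat=len(ctx)):
        yield ([f for f, m in zip(ctx, mask) if m],
               [f for f, m in zip(ctx, mask) if not m])

def provable(left, right):
    """Decide left |- right by exhaustive backward search (no cut, atomic init)."""
    left, right = list(left), list(right)
    # Step (2): an atomic initial sequent X |- X closes the branch.
    if (len(left) == 1 and len(right) == 1
            and left[0] == right[0] and left[0][0] == "atom"):
        return True
    # Step (1), left rules: try each formula on the left as principal.
    for i, f in enumerate(left):
        rest, tag = left[:i] + left[i + 1:], f[0]
        if tag == "zero":
            return True
        if tag == "one" and provable(rest, right):
            return True
        if tag == "bot" and not rest and not right:
            return True
        if tag == "tensor" and provable([f[1], f[2]] + rest, right):
            return True
        if (tag == "plus" and provable([f[1]] + rest, right)
                and provable([f[2]] + rest, right)):
            return True
        if tag == "with" and (provable([f[1]] + rest, right)
                              or provable([f[2]] + rest, right)):
            return True
        if tag == "par" and any(
                provable([f[1]] + l1, r1) and provable([f[2]] + l2, r2)
                for l1, l2 in splits(rest) for r1, r2 in splits(right)):
            return True
    # Step (1), right rules: dual to the cases above.
    for i, f in enumerate(right):
        rest, tag = right[:i] + right[i + 1:], f[0]
        if tag == "top":
            return True
        if tag == "bot" and provable(left, rest):
            return True
        if tag == "one" and not left and not rest:
            return True
        if tag == "par" and provable(left, rest + [f[1], f[2]]):
            return True
        if (tag == "with" and provable(left, rest + [f[1]])
                and provable(left, rest + [f[2]])):
            return True
        if tag == "plus" and (provable(left, rest + [f[1]])
                              or provable(left, rest + [f[2]])):
            return True
        if tag == "tensor" and any(
                provable(l1, r1 + [f[1]]) and provable(l2, r2 + [f[2]])
                for l1, l2 in splits(left) for r1, r2 in splits(rest)):
            return True
    return False
```

For instance, `provable([plus(atom("X"), atom("Y"))], [plus(atom("X"), atom("Y"))])` succeeds only through the left rule, while `provable([atom("X")], [tensor(atom("X"), atom("X"))])` fails because contraction is absent.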

That said, the procedure is wildly inefficient. The source of this inefficiency is the large
number of potential rules/context-splittings that must be tried in step (1). The context-splitting
problem, which we call multiplicative nondeterminism, can be mitigated by a range of techniques
(falling under the corporate-sounding title "resource management"), and is in any case peculiar
to the rules (${\otimes}R$) and (${\parr}L$); let us put it aside, and concentrate on the problem of picking a rule.
We can potentially choose any non-atomic formula in the goal sequent as the principal formula
of a logical rule, and then possibly choose between multiple rules for that formula (if the formula
is $A \oplus B$ on the right of the sequent, or $A \& B$ on the left). Again, let us put aside the latter
source of nondeterminism (we call it additive nondeterminism¹) and concentrate on the act of
choosing some formula in the sequent to be the principal formula. If we make the wrong
choice and the new goals are unprovable, that does not a priori tell us that the original goal is
unprovable, and we must go back and try a different formula. As a simple example, suppose we
are trying to prove $X \oplus Y \vdash X \oplus Y$, and begin by choosing the formula on the right. Whether
we apply (${\oplus}R_1$) or (${\oplus}R_2$), the new goal will be unprovable, and yet the original sequent is
provable if we begin by choosing the formula on the left and applying (${\oplus}L$). In general, then, it
seems we must backtrack and try every possible non-atomic formula in a sequent as the principal
formula, if we want to be guaranteed of either finding a proof or establishing that the sequent
is unprovable... but this impression turns out to be mostly mistaken! Andreoli's focusing proof
search consists of two observations that whittle down much of the nondeterminism in picking
a principal formula.

1 Girard calls $\otimes$ and $\parr$ multiplicative, $\&$ and $\oplus$ additive, hence this terminology, although additive nondeterminism
is "multiplicative" in the sense that the search space multiplies as we work up the proof tree.


3.1.2  Observation  #1:  Invertibility  and  the  Inversion  Phase 


The  first  observation  is  simple:  many  of  the  rules  are  invertible  (recall,  a  rule  is  invertible  if  its 
conclusion  implies  its  premises).  In  particular,  all  of  the  following  left  rules  are  invertible: 


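\[
\dfrac{}{0, \mathfrak{L} \vdash \mathfrak{R}}\ 0L
\qquad
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R} \quad B, \mathfrak{L} \vdash \mathfrak{R}}{A \oplus B, \mathfrak{L} \vdash \mathfrak{R}}\ {\oplus}L
\qquad
\dfrac{\mathfrak{L} \vdash \mathfrak{R}}{1, \mathfrak{L} \vdash \mathfrak{R}}\ 1L
\qquad
\dfrac{A, B, \mathfrak{L} \vdash \mathfrak{R}}{A \otimes B, \mathfrak{L} \vdash \mathfrak{R}}\ {\otimes}L
\]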

as  are  their  dual  right  rules: 

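\[
\dfrac{}{\mathfrak{L} \vdash \mathfrak{R}, \top}\ {\top}R
\qquad
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A \quad \mathfrak{L} \vdash \mathfrak{R}, B}{\mathfrak{L} \vdash \mathfrak{R}, A \& B}\ {\&}R
\qquad
\dfrac{\mathfrak{L} \vdash \mathfrak{R}}{\mathfrak{L} \vdash \mathfrak{R}, \bot}\ {\bot}R
\qquad
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A, B}{\mathfrak{L} \vdash \mathfrak{R}, A \parr B}\ {\parr}R
\]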


We say that the connectives $\otimes$, $\oplus$, $1$, $0$ are left-invertible, while $\&$, $\parr$, $\top$, $\bot$ are right-invertible. When
an  invertible  rule  is  applied  during  step  (1)  of  proof  search,  there  is  no  need  for  backtracking  if 
the  new  goals  fail:  that  the  premises  are  unprovable  is  sufficient  evidence  that  the  conclusion 
is  unprovable.  In  other  words,  if  rules  are  read  bottom-up  as  goal  transformers  (taking  a  goal  to 
a  set  of  new  goals),  an  invertible  rule  is  a  "safe"  transformation  in  the  sense  that  it  preserves 
provability.  We  can  take  this  a  bit  further  by  building  in  an  inversion  phase. 

Definition 3.1.8. A formula inside a sequent (or to be more precise, an occurrence of a formula) is
invertible if it matches the conclusion of an invertible logical rule, and stable otherwise. A sequent/context
is invertible if it contains at least one invertible formula, and stable if it contains only stable formulas.

During the inversion phase of proof search, we greedily apply invertible rules as goal
transformers, until we are left with a set of stable sequents as goals. Note that the order in which
we pick different invertible formulas in the sequent to invert is irrelevant, not only with respect
to provability but also with respect to the ultimate set of stable sequents. For example, when
inverting $(A_1 \& A_2) \otimes X \vdash Y \& (B_1 \oplus B_2)$, whether we first decompose the $\otimes$ on the left, or the $\&$
on the right, we are left with the same two stable sequents:

\[
\dfrac{\dfrac{A_1 \& A_2, X \vdash Y \quad A_1 \& A_2, X \vdash B_1 \oplus B_2}{A_1 \& A_2, X \vdash Y \& (B_1 \oplus B_2)}\ {\&}R}{(A_1 \& A_2) \otimes X \vdash Y \& (B_1 \oplus B_2)}\ {\otimes}L
\qquad
\dfrac{\dfrac{A_1 \& A_2, X \vdash Y}{(A_1 \& A_2) \otimes X \vdash Y}\ {\otimes}L \quad \dfrac{A_1 \& A_2, X \vdash B_1 \oplus B_2}{(A_1 \& A_2) \otimes X \vdash B_1 \oplus B_2}\ {\otimes}L}{(A_1 \& A_2) \otimes X \vdash Y \& (B_1 \oplus B_2)}\ {\&}R
\]

More generally, we can view inversion as replacing a single formula $A$ in context $\mathfrak{L} \vdash \mathfrak{R}$ with a
unique set of stable contexts $\mathfrak{L}_i \vdash \mathfrak{R}_i$, $i = 1..n$, deriving the new set of goals $\mathfrak{L}_i, \mathfrak{L} \vdash \mathfrak{R}, \mathfrak{R}_i$. Since
the original context is carried through unchanged, it doesn't matter in which order we examine
the different invertible formulas in a sequent. And since the result of inversion is unique, we
can view the inversion phase as operating in one big deterministic step.²
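
As a rough illustration of this determinism, the inversion phase can be written as a function from a sequent to its set of stable goals. The sketch below is an assumption-laden toy rather than anything from the text: formulas are nested tuples, `invert` is a hypothetical name, and the `pick` parameter exists only to let us decompose invertible formulas in different orders and compare the results. Following footnote 2, $1$ on the right and $\bot$ on the left are not treated as invertible.

```python
LEFT_INV = {"tensor", "plus", "one", "zero"}    # invertible on the left
RIGHT_INV = {"with", "par", "top", "bot"}       # invertible on the right

def invert(left, right, pick=0):
    """Return the set of stable goals obtained by exhaustive inversion.

    `pick` selects which invertible formula is decomposed first
    (0 = first candidate found, -1 = last candidate found).
    """
    left, right = list(left), list(right)
    cands = [("L", i) for i, f in enumerate(left) if f[0] in LEFT_INV]
    cands += [("R", i) for i, f in enumerate(right) if f[0] in RIGHT_INV]
    if not cands:  # nothing left to invert: the sequent is stable
        return {(tuple(sorted(left, key=repr)), tuple(sorted(right, key=repr)))}
    side, i = cands[pick]
    ctx = left if side == "L" else right
    f, rest = ctx[i], ctx[:i] + ctx[i + 1:]
    if f[0] in ("zero", "top"):        # rule without premises: goal discharged
        return set()
    if f[0] in ("one", "bot"):         # the unit simply disappears
        new = [rest]
    elif f[0] in ("tensor", "par"):    # one premise containing both subformulas
        new = [rest + [f[1], f[2]]]
    else:                              # plus on the left / with on the right
        new = [rest + [f[1]], rest + [f[2]]]
    goals = set()
    for c in new:
        goals |= invert(c, right, pick) if side == "L" else invert(left, c, pick)
    return goals

# The example from the text, with atoms standing in for A1, A2, X, Y, B1, B2.
A1, A2, X, Y, B1, B2 = (("atom", n) for n in ("A1", "A2", "X", "Y", "B1", "B2"))
goal_left = [("tensor", ("with", A1, A2), X)]
goal_right = [("with", Y, ("plus", B1, B2))]
tensor_first = invert(goal_left, goal_right, pick=0)
with_first = invert(goal_left, goal_right, pick=-1)
```

Both orders yield the same two stable goals, so `tensor_first == with_first` holds for this example.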

2 The reader may have noticed we left out ($1R$) and (${\bot}L$) from the list of invertible logical rules, although they are
trivially invertible since they have no premises. This omission can be understood if we read the rules as imposing a
side condition on the context:

\[
\dfrac{\mathfrak{L} = \cdot \quad \mathfrak{R} = \cdot}{\mathfrak{L} \vdash \mathfrak{R}, 1}
\qquad
\dfrac{\mathfrak{L} = \cdot \quad \mathfrak{R} = \cdot}{\bot, \mathfrak{L} \vdash \mathfrak{R}}
\]

In terms of proof search, the rules for $1$ on the right or $\bot$ on the left cannot be applied greedily, because they
force the rest of the context to be empty.
3.1.3  Observation  #2:  Focalization  and  the  Focus  Phase


The  second  observation  is  much  less  obvious,  although  it  turns  out  to  be  the  dual  of  the  first. 
Consider  the  remaining  right  rules: 


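\[
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A}{\mathfrak{L} \vdash \mathfrak{R}, A \oplus B}\ {\oplus}R_1
\qquad
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, B}{\mathfrak{L} \vdash \mathfrak{R}, A \oplus B}\ {\oplus}R_2
\qquad
\dfrac{\mathfrak{L}_1 \vdash \mathfrak{R}_1, A \quad \mathfrak{L}_2 \vdash \mathfrak{R}_2, B}{\mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2, A \otimes B}\ {\otimes}R
\qquad
\dfrac{}{\cdot \vdash 1}\ 1R
\]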


and  dual  left  rules: 


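\[
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R}}{A \& B, \mathfrak{L} \vdash \mathfrak{R}}\ {\&}L_1
\qquad
\dfrac{B, \mathfrak{L} \vdash \mathfrak{R}}{A \& B, \mathfrak{L} \vdash \mathfrak{R}}\ {\&}L_2
\qquad
\dfrac{A, \mathfrak{L}_1 \vdash \mathfrak{R}_1 \quad B, \mathfrak{L}_2 \vdash \mathfrak{R}_2}{A \parr B, \mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2}\ {\parr}L
\qquad
\dfrac{}{\bot \vdash \cdot}\ {\bot}L
\]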


Suppose  we  have  a  stable  sequent  with  non-atomic  formulas.  To  build  a  potential  proof  of 
this  sequent,  we  must  begin  (backwards)  by  applying  one  of  these  rules.  But  where  do  we  go 
after  that?  First,  pay  attention  to  the  fact  that  in  all  of  these  rules,  each  premise  has  exactly 
one  active  formula  (cf.  Definition  3.1.3).  Andreoli's  observation  was  the  following:  after  we 
(nondeterministically)  choose  some  formula  to  be  principal,  and  then  apply  some  logical  rule  on 
that formula (absorbing any multiplicative/additive nondeterminism), it is sufficient to take the
active  formula  in  each  premise  of  the  rule  as  the  principal  formula  of  the  next  logical  rule — and 
pick  the  active  formula  in  each  of  its  premises  as  the  next  principal  formula,  etc.  Andreoli  calls 
this  part  of  proof  search  the  focusing  phase,  since  we  are  always  focused  on  a  particular  formula 
in  a  sequent.  The  focusing  phase  ends  once  we  reach  either  an  invertible  premise,  or  an  atom. 
Let  us  make  this  slightly  more  precise: 

Definition 3.1.9. We say that proof search has entered the focus phase/is focused on a particular
formula inside a stable sequent, if we have committed to using that formula as the principal formula of
the first logical rule (i.e., at the root of the proof), and to maintaining focus on the unique active formula
in each premise of that rule, unless that formula is invertible or atomic. Prior to such a commitment, we
say that proof search is in the neutral phase.

We  can  make  a  few  observations  about  this  definition: 


•  The focus formula can be either on the left or right of the sequent: left if its outermost
connective is among $\&$, $\parr$, $\top$, $\bot$, right if its outermost connective is among $\otimes$, $\oplus$, $1$, $0$.

•  The  definition  does  not  completely  specify  the  treatment  of  atoms,  which  is  flexible:  we  can 
end  the  focus  phase  either  by  completing  the  proof  with  an  atomic  initial  sequent,  or  by 
going  back  to  the  neutral  phase.  However,  for  any  particular  atom,  we  must  be  consistent 
about  this  choice,  either  always  using  an  initial  sequent  when  the  atom  is  in  right-focus, 
or  always  when  it  is  in  left-focus. 

•  If  the  focus  phase  ends  by  going  back  to  an  inversion  phase,  there  is  exactly  one  formula 
to  invert. 


Focusing  proof  search  consists  of  the  entire  cycle,  starting  from  an  inversion  phase,  moving  to  the 
neutral  and  then  the  focus  phase,  and  then  either  completing  the  proof  with  an  initial  sequent  or 
going  back  to  an  inversion  phase.  What  is  remarkable  is  that  this  search  strategy  is  complete,  i.e., 
it  will  always  find  a  proof  if  one  exists.  Since  the  inversion  phase  obviously  preserves  provability, 
what  remains  to  show  completeness  is  the  focalization  lemma. 
Lemma 3.1.10 (Focalization [Andreoli, 1992]). Any provable stable sequent has a proof that begins by
focusing on some formula.

Corollary 3.1.11. Focusing proof search is complete.

Focalization is not, prima facie, an obvious property. It implies that once we have chosen the
principal  formula  of  the  first  logical  rule,  we  do  not  need  to  make  any  more  choices  about 
principal  formulas  until  we  get  back  to  (and  complete)  an  inversion  phase.  What  seemed  like  a 
hopeless  amount  of  nondeterminism  in  the  naive  proof  search  algorithm  of  §3.1.1  can  actually 
be  whittled  down  to  the  following  in  focusing  proof  search: 

•  a  nondeterministic  transition  from  the  neutral  phase  to  the  focus  phase,  picking  a  focus 
formula 

•  any multiplicative/additive nondeterminism incurred during the focus phase

This is really a difference of orders of magnitude, and so focusing is very important for efficient
backwards proof search in linear logic, but that is selling it short. For one, focusing is also
essential to practical "forward" search procedures for linear logic (where we try to work down
from axioms to the goal).³ The wider significance of focusing proof search, though, is the effect
it  has  on  proofs.  We  will  explain  this  contention  by  describing  the  close  correspondence  between 
MALL  focusing  proofs  and  linear  canonical  derivations. 


3.2  Relating  focusing  proofs  to  canonical  derivations 

3.2.1  Polarity,  invertibility,  and  focalization 

We can see that the MALL connectives divide very neatly according to their focusing strategy:
$\otimes$, $\oplus$, $1$, $0$ are inverted on the left and focused on the right, while $\&$, $\parr$, $\top$, $\bot$ are inverted on
the right and focused on the left. This tentatively suggests the following relationship between
connectives' focusing behavior and the notion of polarity used in Chapter 2:

positive ~ invert-left/focus-right        negative ~ invert-right/focus-left

This  correspondence  might  seem  a  bit  strange  conceptually,  though.  In  Chapter  2,  we  explained 
polarity  as  a  way  of  defining  the  propositional  connectives,  either  in  terms  of  proof  or  in  terms 
of  refutation.  Here  we  have  worked  backwards,  starting  with  the  sequent  calculus  for  linear 
logic — which  already  distinguishes  between  different  forms  of  conjunction  and  disjunction — and 
showing  how  to  derive  Andreoli's  focusing  algorithm  by  reasoning  about  the  invertibility  and 
focalizability  of  rules.  Whereas  before  we  saw  positive  and  negative  polarity  as  two  legitimate 
alternatives,  here  it  appears  there  is  no  room  for  choice  about  the  focusing  strategy. 

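But that is not entirely correct. For one, we already saw some flexibility in the treatment
of atoms, and even for logical connectives, focusing behavior cannot always be completely
determined by sequent calculus rules. Consider negation, and the (seemingly trivial) shift:

\[
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A}{\neg A, \mathfrak{L} \vdash \mathfrak{R}}
\qquad
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R}}{\mathfrak{L} \vdash \mathfrak{R}, \neg A}
\qquad
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R}}{\updownarrow A, \mathfrak{L} \vdash \mathfrak{R}}
\qquad
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A}{\mathfrak{L} \vdash \mathfrak{R}, \updownarrow A}
\]

3 See [Chaudhuri and Pfenning, 2005].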
In typical presentations of linear logic, negation is defined as a syntactic operation $(-)^{\perp}$ on
formulas, rather than through left and right rules, and we can see one reason why: its introduction
rules do not fit the pattern above, since both are invertible. Likewise, both left and right rules for
$\updownarrow$ are invertible. The fact that every rule is invertible means we are not forced into adopting a
particular focusing strategy for $\neg A$ and $\updownarrow A$, and either of the following strategies for navigating
between focus and inversion phases is sensible:

1.  Always  remain  in  the  same  phase  (i.e.,  keep  A  in  focus  if  the  conclusion  is  in  focus,  invert 
A  if  the  conclusion  is  being  inverted),  or 

2.  Always  switch  phases  (i.e.,  invert  A  if  the  conclusion  is  in  focus,  stop  inverting  A  if  the 
conclusion  is  being  inverted) 

We will adopt strategy (2) for both. This corresponds to declaring that $\neg$ preserves polarity,
while $\updownarrow$ reverses polarity: the rules for $\neg A$ end the focus/inversion phase because $A$ has the
same polarity but is moved to the opposite side of the sequent, while the rules for $\updownarrow A$ end
focus/inversion because they retain $A$ on the same side of the sequent although it has opposite
polarity. On the other hand, there is nothing about the sequent calculus rules that forces us into
this policy. For example, we could adopt strategy (1) for negation, but then we should call it
by a different name.

\[
\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A}{A^{\perp}, \mathfrak{L} \vdash \mathfrak{R}}
\qquad
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R}}{\mathfrak{L} \vdash \mathfrak{R}, A^{\perp}}
\]

The negation $A^{\perp}$ is defined by the same introduction rules as $\neg A$, but declaring that $A^{\perp}$ has the
opposite polarity of $A$ corresponds to adopting focusing behavior (1) rather than (2).

These  examples  show  that  focusing  behavior  cannot  always  be  inferred  by  looking  at  the 
sequent  calculus  rules.  But  we  can  make  it  explicit,  just  as  we  made  polarity  explicit  in  Chapter  2. 
We  will  again  adopt  our  conventions: 

1.  Every formula (including atoms) has definite positive or negative polarity, indicated $A^+$ or
$A^-$.

2.  Every  connective  combines  formulas  of  a  specific  polarity  to  produce  a  formula  of  specific 
polarity. 

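In particular, the connectives $\otimes$, $\oplus$, $1$, $0$ combine positive formulas into a positive formula, while
$\&$, $\parr$, $\top$, $\bot$ combine negative formulas into a negative formula. As in §2.3.1, we distinguish
different versions of $\updownarrow A$ and $\neg A$, based on the polarity of $A$: we write $\uparrow A$ and $\overset{+}{\neg} A$ when $A$ is
positive (the results are respectively negative or positive), and $\downarrow A$ and $\overset{-}{\neg} A$ when $A$ is negative
(the results are respectively positive or negative). Finally, we add implication $A \to B$ (usually
written $A \multimap B$ in MALL) and subtraction $A - B$ with their standard rules:

\[
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R}, B}{\mathfrak{L} \vdash \mathfrak{R}, A \to B}
\qquad
\dfrac{\mathfrak{L}_1 \vdash A, \mathfrak{R}_1 \quad B, \mathfrak{L}_2 \vdash \mathfrak{R}_2}{A \to B, \mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2}
\qquad
\dfrac{A, \mathfrak{L} \vdash \mathfrak{R}, B}{A - B, \mathfrak{L} \vdash \mathfrak{R}}
\qquad
\dfrac{\mathfrak{L}_1 \vdash A, \mathfrak{R}_1 \quad B, \mathfrak{L}_2 \vdash \mathfrak{R}_2}{\mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2, A - B}
\]

And we adopt the convention that $A \to B$ (resp. $A - B$) is negative (resp. positive) if $A$ is
positive and $B$ negative. Note that certain formulas of MALL are not well-polarized according
to these conventions, for example $1 \& 1$, because $\&$ only applies to negative formulas. However,
to produce a logically equivalent well-polarized formula we can simply insert shifts, e.g., $\uparrow 1 \& \uparrow 1$.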
Under these conventions, technically we are working with what Olivier Laurent calls MALLP,
or polarized MALL. However, since the difference between MALL and MALLP is very minor,
we will keep referring to it as MALL, implicitly assuming the polarity conventions.
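
The shift-insertion step mentioned above is mechanical, and a small sketch may make it concrete. The following Python fragment is only an illustration under assumptions of our own: formulas are nested tuples, atoms carry their polarity as a third component, and `polarize` is a hypothetical helper that inserts `"up"` or `"down"` wherever a subformula has the wrong polarity for the connective above it.

```python
POSITIVE = {"tensor", "plus", "one", "zero"}   # combine positive formulas
NEGATIVE = {"with", "par", "top", "bot"}       # combine negative formulas

def polarize(f):
    """Return (g, pol): f with shifts inserted where needed, and g's polarity.

    Atoms are encoded as ("atom", name, "+") or ("atom", name, "-").
    """
    tag = f[0]
    if tag == "atom":
        return f, f[2]
    if tag in POSITIVE:
        pol = "+"
    elif tag in NEGATIVE:
        pol = "-"
    else:
        raise ValueError(f"unknown connective: {tag}")
    args = []
    for sub in f[1:]:
        g, p = polarize(sub)
        if p != pol:
            # "up" turns a positive formula negative, "down" the reverse.
            g = ("up", g) if pol == "-" else ("down", g)
        args.append(g)
    return (tag, *args), pol

ONE = ("one",)
shifted, pol = polarize(("with", ONE, ONE))
# shifted == ("with", ("up", ("one",)), ("up", ("one",))) and pol == "-"
```

Already well-polarized formulas pass through unchanged, since a shift is inserted only where the polarities disagree.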

3.2.2  Focusing  proofs,  through  a  microscope 

In §3.1, we explained focusing informally as a search procedure over a subset of all MALL
sequent calculus proofs, which are called the focusing proofs. An alternative, more structural way
of viewing focusing is that it is itself defined by a sequent calculus that maintains "punctuation"
to distinguish between the inversion, neutral, and focus phases. Andreoli defined such a sequent
calculus ($\Sigma_3$) in his original paper, and in this section we will consider a very similar presentation.
We begin by applying the polarity discipline to give a more precise characterization of contexts.

Recall (Definition 3.1.8) that a stable context $\mathfrak{L} \vdash \mathfrak{R}$ cannot contain any invertible formulas.
Thus the left half can only contain negative formulas or positive atoms (non-atomic positive
formulas are left-invertible), and the right half only positive formulas or negative atoms
(non-atomic negative formulas are right-invertible). We will use bold-face letters $\mathbf{L}$ and $\mathbf{R}$ to range
over these stable halves:

\[
\mathbf{L} ::= \cdot \mid \mathbf{L}, A^- \mid \mathbf{L}, X^+
\qquad
\mathbf{R} ::= \cdot \mid \mathbf{R}, A^+ \mid \mathbf{R}, X^-
\]

An invertible context, on the other hand, can contain positive formulas on the left or negative
formulas on the right. We use italic letters $L$ and $R$ to range over these invertible halves:

\[
L ::= \cdot \mid L, A^+
\qquad
R ::= \cdot \mid R, A^-
\]

Note we don't force $A$ to be non-atomic, but as part of the inversion phase we will transfer any
atoms $X^+$ from $L$ to $\mathbf{L}$, and any $X^-$ from $R$ to $\mathbf{R}$. Now, let us reformulate the focusing algorithm
by giving inference rules for deriving

invertible sequents   $L; \mathbf{L} \vdash \mathbf{R}; R$
neutral sequents      $\mathbf{L} \vdash \mathbf{R}$
and focused sequents  $\mathbf{L} \vdash \mathbf{R}; [A^+]$ or $[A^-]; \mathbf{L} \vdash \mathbf{R}$.
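
To make the four zones of an invertible sequent concrete, here is a minimal sketch that sorts the formulas of an ordinary two-sided sequent into them. The encoding and the names `polarity`, `is_stable` and `zones` are assumptions made for illustration; in particular, atoms are placed directly into the stable halves, which corresponds to the transfer of $X^+$ and $X^-$ described above having already happened.

```python
POSITIVE = {"tensor", "plus", "one", "zero", "down"}
NEGATIVE = {"with", "par", "top", "bot", "up"}

def polarity(f):
    """Polarity of a formula; atoms are ("atom", name, "+" or "-")."""
    if f[0] == "atom":
        return f[2]
    if f[0] in POSITIVE:
        return "+"
    if f[0] in NEGATIVE:
        return "-"
    raise ValueError(f"unknown connective: {f[0]}")

def is_stable(f, side):
    """Atoms are stable on either side; a compound formula is stable on the
    left iff it is negative, and on the right iff it is positive."""
    if f[0] == "atom":
        return True
    return polarity(f) == ("-" if side == "L" else "+")

def zones(left, right):
    """Split a sequent into (L_invertible, L_stable, R_stable, R_invertible)."""
    return ([f for f in left if not is_stable(f, "L")],
            [f for f in left if is_stable(f, "L")],
            [f for f in right if is_stable(f, "R")],
            [f for f in right if not is_stable(f, "R")])
```

A sequent is neutral exactly when the first and last components returned by `zones` are empty.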

To  prove  an  invertible  sequent  we  apply  a  series  of  invertible  logical  rules,  for  example: 

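\[
\dfrac{A, B, L; \mathbf{L} \vdash \mathbf{R}; R}{A \otimes B, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}; R, A \quad L; \mathbf{L} \vdash \mathbf{R}; R, B}{L; \mathbf{L} \vdash \mathbf{R}; R, A \& B}
\]
\[
\dfrac{A, L; \mathbf{L} \vdash \mathbf{R}; R \quad B, L; \mathbf{L} \vdash \mathbf{R}; R}{A \oplus B, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}; R, A, B}{L; \mathbf{L} \vdash \mathbf{R}; R, A \parr B}
\]
\[
\dfrac{A, L; \mathbf{L} \vdash \mathbf{R}; R, B}{A - B, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{A, L; \mathbf{L} \vdash \mathbf{R}; R, B}{L; \mathbf{L} \vdash \mathbf{R}; R, A \to B}
\]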

The  inversion  phase  draws  to  a  close  as  we  slowly  bring  formulas  from  L  and  R  into  the  stable 
context,  eventually  reaching  a  neutral  sequent: 
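\[
\dfrac{L; X^+, \mathbf{L} \vdash \mathbf{R}; R}{X^+, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}, X^-; R}{L; \mathbf{L} \vdash \mathbf{R}; R, X^-}
\]
\[
\dfrac{L; A^-, \mathbf{L} \vdash \mathbf{R}; R}{\downarrow A, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}, A^+; R}{L; \mathbf{L} \vdash \mathbf{R}; R, \uparrow A}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}, A^+; R}{\overset{+}{\neg} A, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L}, A^- \vdash \mathbf{R}; R}{L; \mathbf{L} \vdash \mathbf{R}; R, \overset{-}{\neg} A}
\]
\[
\dfrac{\mathbf{L} \vdash \mathbf{R}}{\cdot\,; \mathbf{L} \vdash \mathbf{R}; \cdot}
\]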

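To prove a neutral sequent $\mathbf{L} \vdash \mathbf{R}$, we must pick some formula in $\mathbf{L}$ or $\mathbf{R}$ to focus on:

\[
\dfrac{[A^-]; \mathbf{L} \vdash \mathbf{R}}{A^-, \mathbf{L} \vdash \mathbf{R}}
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}; [A^+]}{\mathbf{L} \vdash \mathbf{R}, A^+}
\]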

Note  that  we  treat  the  stable  context  as  unordered,  so  the  focus  formula  can  come  from  anywhere 
in  the  context  (not  necessarily  from  the  perimeter).  During  the  focus  phase,  we  decompose  the 
formula  by  applying  logical  rules,  for  example: 

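\[
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; [A] \quad \mathbf{L}_2 \vdash \mathbf{R}_2; [B]}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [A \otimes B]}
\qquad
\dfrac{[A]; \mathbf{L} \vdash \mathbf{R}}{[A \& B]; \mathbf{L} \vdash \mathbf{R}}
\qquad
\dfrac{[B]; \mathbf{L} \vdash \mathbf{R}}{[A \& B]; \mathbf{L} \vdash \mathbf{R}}
\]
\[
\dfrac{\mathbf{L} \vdash \mathbf{R}; [A]}{\mathbf{L} \vdash \mathbf{R}; [A \oplus B]}
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}; [B]}{\mathbf{L} \vdash \mathbf{R}; [A \oplus B]}
\qquad
\dfrac{[B]; \mathbf{L}_1 \vdash \mathbf{R}_1 \quad [A]; \mathbf{L}_2 \vdash \mathbf{R}_2}{[A \parr B]; \mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2}
\]
\[
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; [A] \quad [B]; \mathbf{L}_2 \vdash \mathbf{R}_2}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [A - B]}
\qquad
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; [A] \quad [B]; \mathbf{L}_2 \vdash \mathbf{R}_2}{[A \to B]; \mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2}
\]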

Observe  that  in  these  rules,  the  premises  are  still  focused  sequents.  By  our  polarity  conventions, 
the  only  way  to  end  the  focus  phase  (other  than  by  reaching  a  logical  rule  with  no  premises, 
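such as for the constants $1$ and $\bot$) is by reaching either an atom, a shift $\updownarrow$, or a negation:

\[
\dfrac{}{X^+ \vdash \cdot\,; [X^+]}
\qquad
\dfrac{}{[X^-]; \cdot \vdash X^-}
\]
\[
\dfrac{\mathbf{L} \vdash \mathbf{R}; A^-}{\mathbf{L} \vdash \mathbf{R}; [\downarrow A]}
\qquad
\dfrac{A^+; \mathbf{L} \vdash \mathbf{R}}{[\uparrow A]; \mathbf{L} \vdash \mathbf{R}}
\qquad
\dfrac{A^+; \mathbf{L} \vdash \mathbf{R}}{\mathbf{L} \vdash \mathbf{R}; [\overset{+}{\neg} A]}
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}; A^-}{[\overset{-}{\neg} A]; \mathbf{L} \vdash \mathbf{R}}
\]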

This  completes  our  definition  of  focusing  proofs  (summarized  in  Figure  3.2  on  a  representative 
fragment  of  logical  rules).  Again,  the  relevant  fact  for  proof  search  is  that  the  focusing  sequent 
calculus is complete for MALL provability, starting from an inversion phase. Writing $\vdash_{[\ell]}$ for
provability in the focusing sequent calculus, we have the following (note we have to be careful
about separating invertible formulas from the stable context):

Theorem 3.2.1 (Completeness of $\Sigma_3$-style focusing). If $L, \mathbf{L} \vdash_{\ell} \mathbf{R}, R$ then $L; \mathbf{L} \vdash_{[\ell]} \mathbf{R}; R$.

This is a more structural way of restating Corollary 3.1.11. The focusing rules are also obviously
sound  for  MALL  provability,  because  if  we  ignore  structural  punctuation,  each  is  either  an 
instance  of  an  ordinary  sequent  calculus  rule,  or  else  trivial  (i.e.,  the  conclusion  and  the  premise 
are  the  same). 

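Theorem 3.2.2 (Soundness of $\Sigma_3$-style focusing for MALL).

•  If $\mathbf{L} \vdash_{[\ell]} \mathbf{R}$ then $\mathbf{L} \vdash_{\ell} \mathbf{R}$

•  If $\mathbf{L} \vdash_{[\ell]} \mathbf{R}; [A]$ then $\mathbf{L} \vdash_{\ell} \mathbf{R}, A$

•  If $[A]; \mathbf{L} \vdash_{[\ell]} \mathbf{R}$ then $A, \mathbf{L} \vdash_{\ell} \mathbf{R}$

•  If $L; \mathbf{L} \vdash_{[\ell]} \mathbf{R}; R$ then $L, \mathbf{L} \vdash_{\ell} \mathbf{R}, R$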
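Contexts

Left-stable        $\mathbf{L} ::= \cdot \mid \mathbf{L}, A^- \mid \mathbf{L}, X^+$
Right-stable       $\mathbf{R} ::= \cdot \mid \mathbf{R}, A^+ \mid \mathbf{R}, X^-$
Left-invertible    $L ::= \cdot \mid L, A^+$
Right-invertible   $R ::= \cdot \mid R, A^-$

Sequents

Neutral       $\mathbf{L} \vdash \mathbf{R}$
Right-focus   $\mathbf{L} \vdash \mathbf{R}; [A^+]$
Left-focus    $[A^-]; \mathbf{L} \vdash \mathbf{R}$
Inversion     $L; \mathbf{L} \vdash \mathbf{R}; R$

Inversion phase

\[
\dfrac{L; \mathbf{L} \vdash \mathbf{R}; R}{1, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{A, B, L; \mathbf{L} \vdash \mathbf{R}; R}{A \otimes B, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}; R, A \quad L; \mathbf{L} \vdash \mathbf{R}; R, B}{L; \mathbf{L} \vdash \mathbf{R}; R, A \& B}
\qquad
\dfrac{}{L; \mathbf{L} \vdash \mathbf{R}; R, \top}
\]
\[
\dfrac{}{0, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{A, L; \mathbf{L} \vdash \mathbf{R}; R \quad B, L; \mathbf{L} \vdash \mathbf{R}; R}{A \oplus B, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}; R, A, B}{L; \mathbf{L} \vdash \mathbf{R}; R, A \parr B}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}; R}{L; \mathbf{L} \vdash \mathbf{R}; R, \bot}
\]

Inversion → Neutral

\[
\dfrac{L; X^+, \mathbf{L} \vdash \mathbf{R}; R}{X^+, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; A^-, \mathbf{L} \vdash \mathbf{R}; R}{\downarrow A, L; \mathbf{L} \vdash \mathbf{R}; R}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}, A^+; R}{L; \mathbf{L} \vdash \mathbf{R}; R, \uparrow A}
\qquad
\dfrac{L; \mathbf{L} \vdash \mathbf{R}, X^-; R}{L; \mathbf{L} \vdash \mathbf{R}; R, X^-}
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}}{\cdot\,; \mathbf{L} \vdash \mathbf{R}; \cdot}
\]

Neutral → Focus

\[
\dfrac{\mathbf{L} \vdash \mathbf{R}; [A^+]}{\mathbf{L} \vdash \mathbf{R}, A^+}
\qquad
\dfrac{[A^-]; \mathbf{L} \vdash \mathbf{R}}{A^-, \mathbf{L} \vdash \mathbf{R}}
\]

Focus phase

\[
\dfrac{}{\cdot \vdash \cdot\,; [1]}
\qquad
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; [A] \quad \mathbf{L}_2 \vdash \mathbf{R}_2; [B]}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [A \otimes B]}
\qquad
\dfrac{[A]; \mathbf{L} \vdash \mathbf{R}}{[A \& B]; \mathbf{L} \vdash \mathbf{R}}
\qquad
\dfrac{[B]; \mathbf{L} \vdash \mathbf{R}}{[A \& B]; \mathbf{L} \vdash \mathbf{R}}
\qquad
\text{(no rule for } \top)
\]
\[
\text{(no rule for } 0)
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}; [A]}{\mathbf{L} \vdash \mathbf{R}; [A \oplus B]}
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}; [B]}{\mathbf{L} \vdash \mathbf{R}; [A \oplus B]}
\qquad
\dfrac{[B]; \mathbf{L}_1 \vdash \mathbf{R}_1 \quad [A]; \mathbf{L}_2 \vdash \mathbf{R}_2}{[A \parr B]; \mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2}
\qquad
\dfrac{}{[\bot]; \cdot \vdash \cdot}
\]

Focus → Inversion/Initial

\[
\dfrac{}{X^+ \vdash \cdot\,; [X^+]}
\qquad
\dfrac{\mathbf{L} \vdash \mathbf{R}; A^-}{\mathbf{L} \vdash \mathbf{R}; [\downarrow A]}
\qquad
\dfrac{A^+; \mathbf{L} \vdash \mathbf{R}}{[\uparrow A]; \mathbf{L} \vdash \mathbf{R}}
\qquad
\dfrac{}{[X^-]; \cdot \vdash X^-}
\]


Figure 3.2: A "$\Sigma_3$-style" focusing sequent calculus for polarized MALL (fragment)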
3.2.3  Focusing proofs, standing back and squinting

We gave some background and intuition in §3.1, but the focusing sequent calculus defined
in §3.2.2 may nonetheless appear daunting. Not only have we yet to formally demonstrate the
completeness theorem (merely quoting the result of Andreoli [1992]), but we haven't even shown
how to derive seemingly trivial theorems, such as the initial sequents

\[
A^+; \cdot \vdash A^+
\qquad \text{and} \qquad
A^- \vdash \cdot\,; A^-
\]

In the ordinary MALL sequent calculus, it is easy to build a derivation of $A \vdash A$ bottom-up by
induction on the structure of $A$, starting with the left (right) rule if $A$ is positive (negative), and
then applying the right (left) rule. For example when $A = B \otimes C$:

\[
\dfrac{\dfrac{B \vdash B \quad C \vdash C}{B, C \vdash B \otimes C}}{B \otimes C \vdash B \otimes C}
\]

With  focusing  proofs  this  approach  does  not  work,  because  the  derivation  must  finish  inverting 
B  and  C  before  applying  any  right  rules.  Essentially,  the  problem  is  that  the  syntactic  structure 
of  A  is  too  fine  a  level  of  granularity  for  reasoning  about  focusing  proofs. 

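So let us stand back and try to get a larger view of focusing, taking as an example the
composite formula $C = \downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)$. To show $C$ in right-focus, we must begin with one of
the following derivations:

\[
\dfrac{
  \dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; A^-}{\mathbf{L}_1 \vdash \mathbf{R}_1; [\downarrow A]}
  \quad
  \dfrac{\dfrac{\mathbf{L}_2 \vdash \mathbf{R}_2; B_1^-}{\mathbf{L}_2 \vdash \mathbf{R}_2; [\downarrow B_1]}}{\mathbf{L}_2 \vdash \mathbf{R}_2; [\downarrow B_1 \oplus \downarrow B_2]}
}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)]}
\qquad
\dfrac{
  \dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; A^-}{\mathbf{L}_1 \vdash \mathbf{R}_1; [\downarrow A]}
  \quad
  \dfrac{\dfrac{\mathbf{L}_2 \vdash \mathbf{R}_2; B_2^-}{\mathbf{L}_2 \vdash \mathbf{R}_2; [\downarrow B_2]}}{\mathbf{L}_2 \vdash \mathbf{R}_2; [\downarrow B_1 \oplus \downarrow B_2]}
}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)]}
\]

Or in other words, collapsing the intermediate steps, with one of the following derived rules.

\[
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; A^- \quad \mathbf{L}_2 \vdash \mathbf{R}_2; B_1^-}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)]}
\qquad
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; A^- \quad \mathbf{L}_2 \vdash \mathbf{R}_2; B_2^-}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)]}
\]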

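Note each rule has a pair of invertible sequents as premises, each with a single invertible formula.
To show $C$ in left-inversion, we must begin with the following derivation:

\[
\dfrac{\dfrac{\dfrac{
  \dfrac{A^-, B_1^-, \mathbf{L} \vdash \mathbf{R}}{\downarrow B_1; A^-, \mathbf{L} \vdash \mathbf{R}}
  \quad
  \dfrac{A^-, B_2^-, \mathbf{L} \vdash \mathbf{R}}{\downarrow B_2; A^-, \mathbf{L} \vdash \mathbf{R}}
}{\downarrow B_1 \oplus \downarrow B_2; A^-, \mathbf{L} \vdash \mathbf{R}}}{\downarrow A, \downarrow B_1 \oplus \downarrow B_2; \mathbf{L} \vdash \mathbf{R}}}{\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2); \mathbf{L} \vdash \mathbf{R}}
\]

Or collapsing the intermediate steps, with the following derived rule:

\[
\dfrac{A^-, B_1^-, \mathbf{L} \vdash \mathbf{R} \quad A^-, B_2^-, \mathbf{L} \vdash \mathbf{R}}{\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2); \mathbf{L} \vdash \mathbf{R}}
\]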

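Note the rule has a pair of neutral sequents as premises, each with two formulas in addition to
the stable context $\mathbf{L} \vdash \mathbf{R}$. Let us place the derived focus and inversion rules side-by-side:

\[
\begin{array}{c}
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; A^- \quad \mathbf{L}_2 \vdash \mathbf{R}_2; B_1^-}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)]}
\\[2ex]
\dfrac{\mathbf{L}_1 \vdash \mathbf{R}_1; A^- \quad \mathbf{L}_2 \vdash \mathbf{R}_2; B_2^-}{\mathbf{L}_1, \mathbf{L}_2 \vdash \mathbf{R}_1, \mathbf{R}_2; [\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)]}
\end{array}
\qquad
\dfrac{A^-, B_1^-, \mathbf{L} \vdash \mathbf{R} \quad A^-, B_2^-, \mathbf{L} \vdash \mathbf{R}}{\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2); \mathbf{L} \vdash \mathbf{R}}
\]

The symmetry is now obvious, and we can use these rules to easily derive an initial sequent for
$C$, assuming we have initial sequents for its negative subformulas $A$, $B_1$, and $B_2$ (here $C$ abbreviates $\downarrow A \otimes (\downarrow B_1 \oplus \downarrow B_2)$):

\[
\dfrac{
  \dfrac{\dfrac{A^- \vdash \cdot\,; A^- \quad B_1^- \vdash \cdot\,; B_1^-}{A^-, B_1^- \vdash \cdot\,; [C]}}{A^-, B_1^- \vdash C}
  \quad
  \dfrac{\dfrac{A^- \vdash \cdot\,; A^- \quad B_2^- \vdash \cdot\,; B_2^-}{A^-, B_2^- \vdash \cdot\,; [C]}}{A^-, B_2^- \vdash C}
}{C; \cdot \vdash C}
\]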

How  do  we  generalize  from  this  example?  The  similarity  between  the  derived  rules  above  and 
Examples  2.1.9  and  2.1.10  from  the  previous  chapter  should  tip  us  off:  we  can  reformulate  the 
focus  and  inversion  phases  in  terms  of  patterns.  Where  we  previously  defined  proof  patterns 
and  refutation  patterns,  we  now  define  right  patterns  and  left  patterns. 

Definition 3.2.3 (Left/right patterns). A derivation of $\mathbf{L} \Vdash \mathbf{R}; [A^+]$ is called a right pattern, while a
derivation of $[A^-]; \mathbf{L} \Vdash \mathbf{R}$ is called a left pattern. Alternatively, these can be called $A$-patterns. The
contexts $\mathbf{L}$ and $\mathbf{R}$ in the conclusion of an $A$-pattern are called its frame, and more specifically $\mathbf{L}$ is its
left-frame, $\mathbf{R}$ its right-frame. The set of all $A$-patterns is called the support of $A$.

As  in  Chapter  2,  we  will  take  a  formula  to  be  literally  defined  by  its  support.  For  example,  we 
define  positive  conjunction  and  disjunction  as  follows: 

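$$\dfrac{}{\cdot \Vdash \cdot; [1]} \qquad \dfrac{L_1 \Vdash R_1; [A] \quad L_2 \Vdash R_2; [B]}{L_1, L_2 \Vdash R_1, R_2; [A \otimes B]}$$

$$\text{(no rule for } 0\text{)} \qquad \dfrac{L \Vdash R; [A]}{L \Vdash R; [A \oplus B]} \qquad \dfrac{L \Vdash R; [B]}{L \Vdash R; [A \oplus B]}$$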

So far these look just like the right-focusing rules in Figure 3.2, only replacing $\vdash$ with $\Vdash$. The difference is that where before we had rules for ending the focus phase and transitioning to inversion, here we instead give axioms for building right patterns:

$$\dfrac{}{X^+ \Vdash \cdot; [X^+]} \qquad \dfrac{}{A^- \Vdash \cdot; [{\downarrow}A]} \qquad \dfrac{}{\cdot \Vdash A^+; [\overset{+}{\neg}A]}$$

In particular, note that where the focusing rules for ${\downarrow}A$ and $\overset{+}{\neg}A$ each had a single premise with the formula $A$ in right- or left-inversion, respectively, the pattern rules for these connectives place the formula $A$ in the left-frame and right-frame, respectively.

Similarly,  we  define  negative  conjunction  and  disjunction  with  left  pattern  rules  that  look 
just  like  their  left-focusing  rules: 

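$$\text{(no rule for } \top\text{)} \qquad \dfrac{[A]; L \Vdash R}{[A \mathbin{\&} B]; L \Vdash R} \qquad \dfrac{[B]; L \Vdash R}{[A \mathbin{\&} B]; L \Vdash R}$$

$$\dfrac{}{[\bot]; \cdot \Vdash \cdot} \qquad \dfrac{[A]; L_1 \Vdash R_1 \quad [B]; L_2 \Vdash R_2}{[A \parr B]; L_1, L_2 \Vdash R_1, R_2}$$

But we define atomic propositions, ${\uparrow}$ and $\overset{-}{\neg}$ by axioms:

$$\dfrac{}{[X^-]; \cdot \Vdash X^-} \qquad \dfrac{}{[{\uparrow}A]; \cdot \Vdash A^+} \qquad \dfrac{}{[\overset{-}{\neg}A]; A^- \Vdash \cdot}$$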

How  do  we  derive  the  focus  and  inversion  rules  from  these  patterns? 

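It is helpful to first define some more "punctuation". We introduce a new sequent form $\parr R'; L \vdash R; \otimes L'$, which is expanded by multiplicative rules:

$$\dfrac{\parr R_1'; L_1 \vdash R_1; \otimes L_1' \quad \parr R_2'; L_2 \vdash R_2; \otimes L_2'}{\parr R_1', R_2'; L_1, L_2 \vdash R_1, R_2; \otimes L_1', L_2'} \qquad \dfrac{}{\parr \cdot; \cdot \vdash \cdot; \otimes \cdot}$$

$$\dfrac{}{\parr \cdot; X^+ \vdash \cdot; \otimes X^+} \qquad \dfrac{A^+; L \vdash R}{\parr A^+; L \vdash R; \otimes \cdot} \qquad \dfrac{L \vdash R; A^-}{\parr \cdot; L \vdash R; \otimes A^-}$$

Now, the rules for deriving left- and right-focused sequents can be expressed concisely:

$$\dfrac{L' \Vdash R'; [A^+] \quad \parr R'; L \vdash R; \otimes L'}{L \vdash R; [A^+]} \qquad \dfrac{[A^-]; L' \Vdash R' \quad \parr R'; L \vdash R; \otimes L'}{[A^-]; L \vdash R}$$

As can the rules for left- and right-inversion:

$$\dfrac{[A^-]; L' \Vdash R' \;\longrightarrow\; L', L \vdash R, R'}{L \vdash R; A^-} \qquad \dfrac{L' \Vdash R'; [A^+] \;\longrightarrow\; L', L \vdash R, R'}{A^+; L \vdash R}$$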

Again, the notation $- \longrightarrow -$ expresses that for every derivation of the judgment on the left, the judgment on the right is derivable. So for example, the right-inversion rule says that to derive $L \vdash R; A^-$, we must derive a set of neutral sequents $L', L \vdash R, R'$, for every $L'$ and $R'$ forming the left- and right-frame of an A-pattern.

Example 3.2.4. Let $C = {\downarrow}A \otimes ({\downarrow}B_1 \oplus {\downarrow}B_2)$. By instantiating these rules with the two patterns for $C$ (derivations of $A^-, B_1^- \Vdash \cdot; [C]$ and $A^-, B_2^- \Vdash \cdot; [C]$), we obtain exactly the focus and inversion rules derived above. ■
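The support of a positive formula can be computed mechanically from the pattern rules. The following is a minimal sketch, assuming a tuple encoding of formulas that is our own and not part of the text (the tags `"atom+"`, `"tensor"`, `"plus"`, `"down"`, `"neg+"` and the function name `right_patterns` are all hypothetical). It returns only the frames of the patterns, which is what the derived focus and inversion rules depend on.

```python
from itertools import product

# Hypothetical encoding (not from the text): formulas are nested tuples.
#   ("atom+", "X")      positive atom X+
#   ("one",), ("zero",) the units 1 and 0
#   ("tensor", A, B)    positive conjunction    ("plus", A, B)  positive disjunction
#   ("down", A)         shift of a negative formula A
#   ("neg+", A)         positive negation of a positive formula A
# Negative formulas only occur under "down" and are never inspected.

def right_patterns(formula):
    """List the frames (L, R) of every right pattern  L ||- R; [formula]."""
    tag = formula[0]
    if tag == "atom+":                      # X+ ||- . ; [X+]
        return [((formula,), ())]
    if tag == "one":                        # .  ||- . ; [1]
        return [((), ())]
    if tag == "zero":                       # no rule for 0
        return []
    if tag == "tensor":                     # one pattern for each component
        return [(l1 + l2, r1 + r2)
                for (l1, r1), (l2, r2)
                in product(right_patterns(formula[1]), right_patterns(formula[2]))]
    if tag == "plus":                       # a pattern for either component
        return right_patterns(formula[1]) + right_patterns(formula[2])
    if tag == "down":                       # A- ||- . ; [down A]
        return [((formula[1],), ())]
    if tag == "neg+":                       # .  ||- A+ ; [neg+ A]
        return [((), (formula[1],))]
    raise ValueError(f"not a positive formula: {formula!r}")


# The formula C of Example 3.2.4, with A, B1, B2 standing for negative formulas.
A, B1, B2 = ("atom-", "A"), ("atom-", "B1"), ("atom-", "B2")
C = ("tensor", ("down", A), ("plus", ("down", B1), ("down", B2)))
for left_frame, right_frame in right_patterns(C):
    print(left_frame, right_frame)
```

Under this encoding, `C` has exactly two patterns, with left-frames `(A, B1)` and `(A, B2)` and empty right-frames, matching the two premises of the derived inversion rule.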


This  completes  the  pattern-based  reformulation  of  focusing  proofs,  summarized  in  Figures  3.3 
and  3.4.  We  call  this  a  "reformulation"  because  it  does  not  essentially  change  the  structure  of 
proofs,  except  by  collapsing  multiple  steps  in  the  focus  and  inversion  phases. 

Notation. We associate a derivation $\mathcal{D}$ with a judgment $J$ either by writing the derivation over the judgment, or by separating them with a double colon, $\mathcal{D} :: J$.

Lemma 3.2.5 (Focus phase collapse). We can interpret sequents $\parr R'; L \vdash R; \otimes L'$ in the $\Sigma_3$-style system as above, by expanding them out with multiplicative rules. Then there is a $\Sigma_3$-style proof $\mathcal{D} :: (L \vdash R; [A^+])$ (respectively $[A^-]; L \vdash R$) iff there exists $L' \Vdash R'; [A^+]$ (resp. $[A^-]; L' \Vdash R'$) and a $\Sigma_3$-style proof of $\parr R'; L \vdash R; \otimes L'$ built out of subderivations of $\mathcal{D}$.

Proof. By induction on $A$. □

Lemma 3.2.6 (Inversion phase collapse). Let $\mathfrak{L} = A_1^+, \dots, A_m^+$ and $\mathfrak{R} = A_{m+1}^-, \dots, A_{m+n}^-$. There is a $\Sigma_3$-style proof $\mathcal{D} :: (\mathfrak{L}; L \vdash R; \mathfrak{R})$ iff for any $L_1 \Vdash R_1; [A_1^+], \dots, L_m \Vdash R_m; [A_m^+]$ and $[A_{m+1}^-]; L_{m+1} \Vdash R_{m+1}, \dots, [A_{m+n}^-]; L_{m+n} \Vdash R_{m+n}$, there is a proof $\mathcal{E} :: (L_{m+n}, \dots, L_1, L \vdash R, R_1, \dots, R_{m+n})$, where $\mathcal{E}$ is a subderivation of $\mathcal{D}$.
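$$\dfrac{}{\cdot \Vdash \cdot; [1]} \qquad \dfrac{L_1 \Vdash R_1; [A] \quad L_2 \Vdash R_2; [B]}{L_1, L_2 \Vdash R_1, R_2; [A \otimes B]}$$

$$\text{(no rule for } 0\text{)} \qquad \dfrac{L \Vdash R; [A]}{L \Vdash R; [A \oplus B]} \qquad \dfrac{L \Vdash R; [B]}{L \Vdash R; [A \oplus B]}$$

$$\dfrac{}{X^+ \Vdash \cdot; [X^+]} \qquad \dfrac{}{A^- \Vdash \cdot; [{\downarrow}A]} \qquad \dfrac{}{\cdot \Vdash A^+; [\overset{+}{\neg}A]}$$

$$\text{(no rule for } \top\text{)} \qquad \dfrac{[A]; L \Vdash R}{[A \mathbin{\&} B]; L \Vdash R} \qquad \dfrac{[B]; L \Vdash R}{[A \mathbin{\&} B]; L \Vdash R}$$

$$\dfrac{}{[\bot]; \cdot \Vdash \cdot} \qquad \dfrac{[A]; L_1 \Vdash R_1 \quad [B]; L_2 \Vdash R_2}{[A \parr B]; L_1, L_2 \Vdash R_1, R_2}$$

$$\dfrac{}{[X^-]; \cdot \Vdash X^-} \qquad \dfrac{}{[{\uparrow}A]; \cdot \Vdash A^+} \qquad \dfrac{}{[\overset{-}{\neg}A]; A^- \Vdash \cdot}$$

Figure 3.3: Definition of some polarized MALL connectives by patterns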


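Sequent forms:

Neutral: $L \vdash R$
Right-focus: $L \vdash R; [A^+]$
Left-inversion: $A^+; L \vdash R$
Right-inversion: $L \vdash R; A^-$
Left-focus: $[A^-]; L \vdash R$
Multiplicative: $\parr R'; L \vdash R; \otimes L'$

$$\dfrac{L' \Vdash R'; [A^+] \quad \parr R'; L \vdash R; \otimes L'}{L \vdash R; [A^+]} \qquad \dfrac{L' \Vdash R'; [A^+] \;\longrightarrow\; L', L \vdash R, R'}{A^+; L \vdash R}$$

$$\dfrac{[A^-]; L' \Vdash R' \;\longrightarrow\; L', L \vdash R, R'}{L \vdash R; A^-} \qquad \dfrac{[A^-]; L' \Vdash R' \quad \parr R'; L \vdash R; \otimes L'}{[A^-]; L \vdash R}$$

$$\dfrac{\parr R_1'; L_1 \vdash R_1; \otimes L_1' \quad \parr R_2'; L_2 \vdash R_2; \otimes L_2'}{\parr R_1', R_2'; L_1, L_2 \vdash R_1, R_2; \otimes L_1', L_2'}$$

$$\dfrac{}{\parr \cdot; X^+ \vdash \cdot; \otimes X^+} \qquad \dfrac{A^+; L \vdash R}{\parr A^+; L \vdash R; \otimes \cdot} \qquad \dfrac{L \vdash R; A^-}{\parr \cdot; L \vdash R; \otimes A^-} \qquad \dfrac{}{\parr X^-; \cdot \vdash X^-; \otimes \cdot}$$

$$\dfrac{L \vdash R; [A^+]}{L \vdash R, A^+} \qquad \dfrac{[A^-]; L \vdash R}{A^-, L \vdash R}$$

Figure 3.4: A pattern-based formulation of MALL focusing proofs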




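Proof. By induction on $\mathfrak{L}$ and $\mathfrak{R}$. □

Corollary 3.2.7 (Equivalence of focusing calculi). Writing $\vdash_{[\mathcal{L}\bullet]}$ for provability in the pattern-based formulation, we have:

1. $L \vdash_{[\mathcal{L}\bullet]} R; [A^+]$ iff $L \vdash_{[\mathcal{L}]} R; [A^+]$

2. $A^+; L \vdash_{[\mathcal{L}\bullet]} R$ iff $A^+; L \vdash_{[\mathcal{L}]} R$

3. $L \vdash_{[\mathcal{L}\bullet]} R; A^-$ iff $L \vdash_{[\mathcal{L}]} R; A^-$

4. $[A^-]; L \vdash_{[\mathcal{L}\bullet]} R$ iff $[A^-]; L \vdash_{[\mathcal{L}]} R$

5. $L \vdash_{[\mathcal{L}\bullet]} R$ iff $L \vdash_{[\mathcal{L}]} R$

Proof. By induction on derivations, applying Lemmas 3.2.5 and 3.2.6. □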

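Corollary 3.2.8 (Soundness of pattern-based formulation).

1. If $L \vdash_{[\mathcal{L}\bullet]} R; [A^+]$ then $L \vdash_{\mathcal{L}} R, A^+$

2. If $[A^-]; L \vdash_{[\mathcal{L}\bullet]} R$ then $A^-, L \vdash_{\mathcal{L}} R$

3. If $L \vdash_{[\mathcal{L}\bullet]} R; A^-$ then $L \vdash_{\mathcal{L}} R, A^-$

4. If $A^+; L \vdash_{[\mathcal{L}\bullet]} R$ then $A^+, L \vdash_{\mathcal{L}} R$

5. If $L \vdash_{[\mathcal{L}\bullet]} R$ then $L \vdash_{\mathcal{L}} R$

6. If $\parr R'; L \vdash_{[\mathcal{L}\bullet]} R; \otimes L'$ then $(\parr R'), L \vdash_{\mathcal{L}} R, (\otimes L')$

Proof. (1)-(5) follow by composing Corollary 3.2.7 with Theorem 3.2.2. (6) reduces to (3) and (4) by expanding $\parr R'; L \vdash_{[\mathcal{L}\bullet]} R; \otimes L'$ into a list of sequents, applying soundness, and then recombining the MALL proofs with a series of $\parr L$ and $\otimes R$ rules to obtain $(\parr R'), L \vdash_{\mathcal{L}} R, (\otimes L')$. □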


3.2.4  Focusing  proofs  are  canonical  derivations 

While the correspondence between Figures 3.2 and 3.4 is fairly direct, what should be even more striking is the correspondence between Figure 3.4 and the linear rules of Figure 2.9. Indeed, there is a trivial syntactic isomorphism between them. For any stable context $L \vdash R$, we can build a corresponding simple frame $\Delta_{L \vdash R}$:

$$\Delta_{L \vdash R} = \{A^- \mid A^- \in L\} \cup \{X^+ \mid X^+ \in L\} \cup \{\bullet A^+ \mid A^+ \in R\} \cup \{\bullet X^- \mid X^- \in R\}$$

And conversely, given any simple frame $\Delta$, we can build the stable context $L_\Delta \vdash R_\Delta$:

$$L_\Delta = \{A^- \mid A^- \in \Delta\} \cup \{X^+ \mid X^+ \in \Delta\}$$

$$R_\Delta = \{A^+ \mid \bullet A^+ \in \Delta\} \cup \{X^- \mid \bullet X^- \in \Delta\}$$

We say that a stable context $L \vdash R$ and a frame of simple hypotheses $\Delta$ are interchangeable (written $(L \vdash R) \longleftrightarrow \Delta$) when $\Delta = \Delta_{L \vdash R}$, $L = L_\Delta$, $R = R_\Delta$. We extend this convention to contexts of simple hypotheses, writing $(L \vdash R) \longleftrightarrow \Gamma$, by viewing $\Gamma$ as a frame.
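The two constructions are inverse to each other, which a small round-trip check makes concrete. This is only an illustrative sketch: the `Formula` record, the tags `"assert"` and `"refute"`, and the function names `frame_of` and `context_of` are assumptions of ours, and contexts are modelled as ordered lists rather than multisets.

```python
from typing import NamedTuple

class Formula(NamedTuple):
    # Hypothetical representation: a name, a polarity, and an atomicity flag.
    name: str
    polarity: str          # "+" or "-"
    atomic: bool = False

def stable_left(f):
    """Formulas allowed on the left of a stable context: A- or X+."""
    return f.polarity == "-" or f.atomic

def stable_right(f):
    """Formulas allowed on the right of a stable context: A+ or X-."""
    return f.polarity == "+" or f.atomic

def frame_of(L, R):
    """Build the simple frame for the stable context L |- R."""
    assert all(stable_left(f) for f in L), "left context is not stable"
    assert all(stable_right(f) for f in R), "right context is not stable"
    # Left formulas become proof hypotheses, right formulas refutation hypotheses.
    return [("assert", f) for f in L] + [("refute", f) for f in R]

def context_of(delta):
    """Recover the stable context from a simple frame."""
    L = [f for kind, f in delta if kind == "assert"]
    R = [f for kind, f in delta if kind == "refute"]
    return L, R


L = [Formula("A", "-"), Formula("X", "+", atomic=True)]
R = [Formula("B", "+"), Formula("Y", "-", atomic=True)]
delta = frame_of(L, R)
print(context_of(delta) == (L, R))
```

The check only exercises the direction from contexts to frames and back; the other direction is symmetric for frames whose proof hypotheses are listed before their refutation hypotheses.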

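Now, let $(L \vdash R) \longleftrightarrow \Gamma$ and $(L' \vdash R') \longleftrightarrow \Delta$ be interchangeable. We say that judgments are interchangeable as follows:

$$\begin{array}{rclcrcl}
L' \Vdash R'; [A^+] & \longleftrightarrow & \Delta \Vdash A^+ & \qquad & [A^-]; L' \Vdash R' & \longleftrightarrow & \Delta \Vdash \bullet A^- \\
L \vdash R; [A^+] & \longleftrightarrow & \Gamma \vdash A^+ & \qquad & A^+; L \vdash R & \longleftrightarrow & \Gamma \vdash \bullet A^+ \\
L \vdash R; A^- & \longleftrightarrow & \Gamma \vdash A^- & \qquad & [A^-]; L \vdash R & \longleftrightarrow & \Gamma \vdash \bullet A^- \\
\parr R'; L \vdash R; \otimes L' & \longleftrightarrow & \Gamma \vdash \Delta & \qquad & L \vdash R & \longleftrightarrow & \Gamma \vdash \#
\end{array}$$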


Under  this  translation  guide,  the  rules  for  building  MALL  focusing  proofs  are  identical  to  the 
rules  for  building  linear  canonical  derivations.  As  a  corollary,  the  identity  and  composition 
principles  for  linear  canonical  derivations  can  be  translated  into  initial  sequents  and  cuts  on 
focusing  proofs. 


Theorem 3.2.9 (Initial sequents). The following initial sequents are derivable in $\vdash_{[\mathcal{L}\bullet]}$:

1. $A^+; \cdot \vdash A^+$

2. $A^- \vdash \cdot; A^-$

3. $X^+ \vdash \cdot; [X^+]$

4. 

Theorem 3.2.10 (Cut-admissibility). The following cuts are admissible in $\vdash_{[\mathcal{L}\bullet]}$:

1. If $L_1 \vdash R_1; [A^+]$ and $A^+; L_2 \vdash R_2$ then $L_1, L_2 \vdash R_1, R_2$

2. If $L_1 \vdash R_1; A^-$ and $[A^-]; L_2 \vdash R_2$ then $L_1, L_2 \vdash R_1, R_2$

3. If $\parr R'; L_1 \vdash R_1; \otimes L'$ and $L', L_2 \vdash R_2, R'; [A^+]$ then $L_1, L_2 \vdash R_1, R_2; [A^+]$

4. If $\parr R'; L_1 \vdash R_1; \otimes L'$ and $[A^-]; L', L_2 \vdash R_2, R'$ then $[A^-]; L_1, L_2 \vdash R_1, R_2$

5. If $\parr R'; L_1 \vdash R_1; \otimes L'$ and $L', L_2 \vdash R_2, R'; A^-$ then $L_1, L_2 \vdash R_1, R_2; A^-$

6. If $\parr R'; L_1 \vdash R_1; \otimes L'$ and $A^+; L', L_2 \vdash R_2, R'$ then $A^+; L_1, L_2 \vdash R_1, R_2$

7. If $\parr R'; L_1 \vdash R_1; \otimes L'$ and $\parr R''; L', L_2 \vdash R_2, R'; \otimes L''$ then $\parr R''; L_1, L_2 \vdash R_1, R_2; \otimes L''$

8. If $\parr R'; L_1 \vdash R_1; \otimes L'$ and $L', L_2 \vdash R_2, R'$ then $L_1, L_2 \vdash R_1, R_2$

Proof.  These  are  all  instances  of  the  identity  and  composition  principles  for  linear  canonical 
derivations,  under  the  translation  guide.  □ 


The  translation  guide,  in  effect,  says  that  focusing  proofs  and  canonical  derivations  are  the  same 
thing.  On  the  other  hand,  the  long  list  of  cut  principles  in  Theorem  3.4.8  gives  a  suggestion 
as  to  why  we  introduced  the  notion  of  canonical  derivations  in  the  first  place,  beyond  any 
philosophical  motivation:  it  is  simply  a  better  notation.  Cuts  (3-8)  are  all  instances  of  a  single 
composition  principle: 

Composition (substitution). If $\Gamma_1(\Delta) \vdash J$ and $\Gamma_2 \vdash \Delta$ then $\Gamma_1(\Gamma_2) \vdash J$
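To see how one of the cuts arises, the following display unpacks cut (8) as the instance of composition with $J = \#$. This is our own unpacking under the translation guide, reading $\Gamma_1(\Delta)$ as $\Gamma_1, \Delta$ in the linear setting and assuming $(L_1 \vdash R_1) \longleftrightarrow \Gamma_2$, $(L_2 \vdash R_2) \longleftrightarrow \Gamma_1$ and $(L' \vdash R') \longleftrightarrow \Delta$; the assignment of names is a choice made here for illustration.

```latex
\begin{align*}
\parr R';\, L_1 \vdash R_1;\, \otimes L'
  &\quad\longleftrightarrow\quad \Gamma_2 \vdash \Delta
  && \text{(first premise of cut 8)} \\
L', L_2 \vdash R_2, R'
  &\quad\longleftrightarrow\quad \Gamma_1, \Delta \vdash \#
  && \text{(second premise of cut 8)} \\
L_1, L_2 \vdash R_1, R_2
  &\quad\longleftrightarrow\quad \Gamma_1, \Gamma_2 \vdash \#
  && \text{(conclusion)}
\end{align*}
```

Cuts (3) through (7) should follow the same shape, with $J$ ranging over the other judgment forms on the right-hand side of the translation guide.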

3.2.5 Complex hypotheses and weak focalization

If canonical derivations in a context of simple hypotheses correspond to focusing proofs, what happens when we add complex hypotheses? Following the convention we established above, complex proof hypotheses $A^+ \in \Gamma$ should become positive formulas on the left side of a sequent, while complex refutation hypotheses $\bullet A^- \in \Gamma$ should become negative formulas on the right side. But such sequents are not stable! In other words, contexts of arbitrary hypotheses correspond to arbitrary sequents. If we don't care about separating simple from complex hypotheses, we can state the criteria for interchangeability $(\mathfrak{L} \vdash \mathfrak{R}) \longleftrightarrow \Gamma$ more concisely:

$$\Gamma_{\mathfrak{L} \vdash \mathfrak{R}} = \mathfrak{L}, \bullet\mathfrak{R} \qquad \mathfrak{L}_\Gamma = \{A \mid A \in \Gamma\} \qquad \mathfrak{R}_\Gamma = \{A \mid \bullet A \in \Gamma\}$$

Now, as we discussed, the first step in a bottom-up search for a focusing proof of $\mathfrak{L}; L \vdash R; \mathfrak{R}$ is to eagerly invert the formulas in $\mathfrak{L}$ and $\mathfrak{R}$, creating a set of stable sequents as goals. But the rules for using complex hypotheses do not force such a discipline. Recall the rules:

$$\dfrac{\Delta \Vdash A^+ \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(A^+) \vdash J} \qquad \dfrac{\Delta \Vdash \bullet A^- \;\longrightarrow\; \Gamma(\Delta) \vdash J}{\Gamma(\bullet A^-) \vdash J}$$

We are allowed to retain complex hypotheses as long as we wish and invert them at any stage, even
while  another  formula  is  in  focus.  Therefore,  in  the  terminology  of  Laurent  [2004a],  canonical 
derivations  with  complex  hypotheses  correspond  to  proofs  that  are  weakly  focalized,  as  opposed 
to  fully  focalized.  On  the  other  hand,  Laurent's  observation  was  that  a  weakly  focalized  proof 
can  be  easily  converted  into  a  fully  focalized  one,  precisely  because  invertible  rules  are  invertible. 
In  other  words,  a  canonical  derivation  with  complex  hypotheses  can  be  trivially  converted  into 
one  with  only  simple  hypotheses,  by  eagerly  applying  pattern  substitution  (Prop.  2.1.14). 


3.3  Completeness  of  focusing  proofs 

In  this  section  we  prove  the  completeness  of  focusing,  using  the  interpretation  as  canonical 
derivations.  Our  proof  is  similar  in  structure  to  that  of  Laurent  [2004a],  who  simplified  Andreoli's 
original proof [1992]. We prove a weak focalization lemma, and then show this implies full
focusing. 

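Lemma 3.3.1 (Weak focalization). Let $(\mathfrak{L} \vdash \mathfrak{R}) \longleftrightarrow \Gamma$. Then $\mathfrak{L} \vdash_{\mathcal{L}} \mathfrak{R}$ implies there is a linear canonical derivation of $\Gamma \vdash \#$.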

Proof.  By  induction  on  the  MALL  sequent  calculus  proof.  Without  loss  of  generality,  we  can 
assume  the  MALL  derivation  does  not  have  any  uses  of  cut,  and  that  init  is  restricted  to  atomic 
formulas.  For  atomic  initial  sequents,  we  directly  apply  the  identity  principle  for  linear  canonical 
derivations.  Otherwise,  the  derivation  ends  in  a  logical  rule. 

There are essentially two kinds of cases: the rule introduces a formula that is either in the inverting context ($A^+ \in \mathfrak{L}$ or $A^- \in \mathfrak{R}$) or in the stable context ($A^+ \in \mathfrak{R}$ or $A^- \in \mathfrak{L}$). The negative
polarity  cases  are  dual  to  positive  polarity,  so  we  show  here  only  some  illustrative  examples 
where  a  positive  formula  is  introduced  on  the  left  (invertible)  or  right  (stable). 
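• Case $\otimes R$: The proof ends in

$$\dfrac{\mathfrak{L}_1 \vdash \mathfrak{R}_1, A \quad \mathfrak{L}_2 \vdash \mathfrak{R}_2, B}{\mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2, A \otimes B}$$

where $(\mathfrak{L}_1 \vdash \mathfrak{R}_1) \longleftrightarrow \Gamma_1$ and $(\mathfrak{L}_2 \vdash \mathfrak{R}_2) \longleftrightarrow \Gamma_2$. By the induction hypothesis, there exist linear canonical derivations (lcds) $\mathcal{C}_1 :: (\Gamma_1, \bullet A \vdash \#)$ and $\mathcal{C}_2 :: (\Gamma_2, \bullet B \vdash \#)$. Now, note that for any patterns $p_1 :: (\Delta_1 \Vdash A)$ and $p_2 :: (\Delta_2 \Vdash B)$, we can build a derivation $\mathcal{C}_{(p_1,p_2)} :: (\bullet A \otimes B, \Delta_1, \Delta_2 \vdash \#)$ as follows:

$$\mathcal{C}_{(p_1,p_2)} \;=\; \dfrac{\dfrac{\dfrac{p_1 :: \Delta_1 \Vdash A \quad p_2 :: \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B} \quad \dfrac{}{\Delta_1, \Delta_2 \vdash \Delta_1, \Delta_2}\,id}{\Delta_1, \Delta_2 \vdash A \otimes B}}{\bullet A \otimes B, \Delta_1, \Delta_2 \vdash \#}$$

Then we derive $\Gamma_1, \Gamma_2, \bullet A \otimes B \vdash \#$ with

$$\dfrac{\mathcal{C}_1 :: \Gamma_1, \bullet A \vdash \# \quad \dfrac{p_1 :: \Delta_1 \Vdash A \;\longrightarrow\; \dfrac{\mathcal{C}_2 :: \Gamma_2, \bullet B \vdash \# \quad \dfrac{p_2 :: \Delta_2 \Vdash B \;\longrightarrow\; \mathcal{C}_{(p_1,p_2)} :: \bullet A \otimes B, \Delta_1, \Delta_2 \vdash \#}{\bullet A \otimes B, \Delta_1 \vdash \bullet B}}{\Gamma_2, \bullet A \otimes B, \Delta_1 \vdash \#}\,\dagger}{\Gamma_2, \bullet A \otimes B \vdash \bullet A}}{\Gamma_1, \Gamma_2, \bullet A \otimes B \vdash \#}\,\dagger$$

where ($\dagger$) indicates uses of the composition principle (the right derivation being substituted into the left). Observe that the derivation here is a bit arbitrary: we could just as well use $\mathcal{C}_1$ and $\mathcal{C}_2$ in the opposite order.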

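• Case $\oplus R$: The proof ends in either

$$\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A}{\mathfrak{L} \vdash \mathfrak{R}, A \oplus B} \qquad \text{or} \qquad \dfrac{\mathfrak{L} \vdash \mathfrak{R}, B}{\mathfrak{L} \vdash \mathfrak{R}, A \oplus B}$$

where $(\mathfrak{L} \vdash \mathfrak{R}) \longleftrightarrow \Gamma$. The two cases are symmetric, so assume the former, and by the i.h. there exists an lcd $\mathcal{C} :: (\Gamma, \bullet A \vdash \#)$. Analogously to (case $\otimes$), we first note that for any $p :: (\Delta \Vdash A)$, there exists a derivation $\mathcal{C}_{\mathrm{inl}\,p} :: (\bullet A \oplus B, \Delta \vdash \#)$:

$$\mathcal{C}_{\mathrm{inl}\,p} \;=\; \dfrac{\dfrac{\dfrac{p :: \Delta \Vdash A}{\Delta \Vdash A \oplus B} \quad \dfrac{}{\Delta \vdash \Delta}\,id}{\Delta \vdash A \oplus B}}{\bullet A \oplus B, \Delta \vdash \#}$$

Then we construct

$$\dfrac{\mathcal{C} :: \Gamma, \bullet A \vdash \# \quad \dfrac{p :: \Delta \Vdash A \;\longrightarrow\; \mathcal{C}_{\mathrm{inl}\,p} :: \bullet A \oplus B, \Delta \vdash \#}{\bullet A \oplus B \vdash \bullet A}}{\Gamma, \bullet A \oplus B \vdash \#}\,\dagger$$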


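• Case ${\downarrow}R$: The proof ends in

$$\dfrac{\mathfrak{L} \vdash \mathfrak{R}, A}{\mathfrak{L} \vdash \mathfrak{R}, {\downarrow}A}$$

where $(\mathfrak{L} \vdash \mathfrak{R}) \longleftrightarrow \Gamma$. By the i.h., there exists an lcd $\mathcal{C} :: (\Gamma, \bullet A^- \vdash \#)$. Then we construct:

$$\dfrac{\dfrac{\dfrac{\mathcal{C} :: \Gamma, \bullet A^- \vdash \#}{\Gamma \vdash A^-}}{\Gamma \vdash {\downarrow}A}\,\dagger_1}{\Gamma, \bullet{\downarrow}A \vdash \#}\,\dagger_2$$

using the derived rule $\dagger_1$ for proving ${\downarrow}A$ (Prop. 2.3.1), and the admissible step $\dagger_2$ introducing a complex hypothesis.

• Case $\otimes L$: By the i.h., there is an lcd $\mathcal{C} :: (\Gamma, A^+, B^+ \vdash \#)$. We immediately derive $\Gamma, A \otimes B \vdash \#$ using the analysis rule, by noting that every $A \otimes B$-pattern decomposes as a pair of an $A$-pattern and a $B$-pattern, and applying pattern substitution for each hypothesis.

• Case ${\downarrow}L$: By the i.h., there is an lcd $\mathcal{C} :: (\Gamma, A^- \vdash \#)$. We immediately derive $\Gamma, {\downarrow}A \vdash \#$ by applying the analysis rule, noting that there is only a single ${\downarrow}A$-pattern, with frame $A^-$.

□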

Corollary 3.3.2 (Focusing completeness). If $\mathfrak{L} \vdash \mathfrak{R}$ has a MALL proof then it has a fully focusing proof.

Proof. Although we introduced complex hypotheses in the proof of weak focalization (e.g., in cases (${\downarrow}R$) and ($\otimes L$)), these were admissible steps, rather than canonical rules. The canonical
derivation  that  results  from  weak  focalization  never  introduces  complex  hypotheses — it  only 
decomposes  them — and  so  by  pattern  substitution,  we  can  perform  this  decomposition  as  the 
first  (bottom-up)  step  of  a  fully  focusing  proof.  □ 

Corollary 3.3.3. Weak entailment for linear canonical derivations coincides with MALL entailment, i.e., $A^+ \leq^- B^+$ (or $A^- \leq^+ B^-$) iff $A \vdash_{\mathcal{L}} B$.

Proof. In the positive case, by definition, we have $A^+ \leq^- B^+$ iff there is an lcd of $\bullet B^+ \vdash \bullet A^+$, which is equivalent to $\bullet B^+, A^+ \vdash \#$. This is true if (Lemma 3.3.1) and only if (Corollary 3.2.8) $A^+ \vdash_{\mathcal{L}} B^+$. Likewise, in the negative case, by definition $A^- \leq^+ B^-$ iff there is an lcd of $A^- \vdash B^-$, which is equivalent to $\bullet B^-, A^- \vdash \#$, and holds if and only if $A^- \vdash_{\mathcal{L}} B^-$. □


3.4  Unrestricted  derivations  and  classical  sequent  calculus 

We  have  seen  the  close  correspondence  between  linear  canonical  derivations  and  Andreoli's 
focusing  proofs  for  MALL,  and  examined  the  relationship  with  MALL  itself  via  soundness  and 
completeness  theorems.  Now,  we  will  show  how  unrestricted  canonical  derivations  are  in  the 
same  sort  of  relationship  with  classical  logic. 


3.4.1  Polarizations  of  classical  logic 


To  get  to  this  result  as  quickly  as  possible,  we  will  take  as  our  axiomatization  of  classical  logic 
the  polarized  MALL  sequent  calculus,  together  with  explicit  structural  rules  of  weakening  and 
contraction: 


$$\dfrac{\mathfrak{L} \vdash \mathfrak{R}}{A, \mathfrak{L} \vdash \mathfrak{R}}\,WL \qquad \dfrac{A, A, \mathfrak{L} \vdash \mathfrak{R}}{A, \mathfrak{L} \vdash \mathfrak{R}}\,CL \qquad \dfrac{\mathfrak{L} \vdash \mathfrak{R}, A, A}{\mathfrak{L} \vdash \mathfrak{R}, A}\,CR \qquad \dfrac{\mathfrak{L} \vdash \mathfrak{R}}{\mathfrak{L} \vdash \mathfrak{R}, A}\,WR$$

We write $\vdash_{\mathcal{C}}$ for provability in this sequent calculus. As is well-known, this axiomatization contains a lot of redundancy. For example, the two forms of conjunction $\otimes$ and $\&$ are logically equivalent, as are the two forms of disjunction.⁴ The different negations ($\overset{+}{\neg}$, $\overset{-}{\neg}$) are also logically equivalent, although they already were in MALL. Formally, we are working with formulas in the syntax of PPL (cf. Definition 2.3.4), which correspond to different polarizations of classical propositions. To make this precise, let us use letters $a, b, c$ to range over formulas of propositional logic, built out of atoms and the connectives $T$, $F$, $\wedge$, $\vee$, $\sim$.

Definition 3.4.1 (Polarization). Let $|-|$ be the map

$$|1| = |\top| = T \qquad |0| = |\bot| = F \qquad |X| = X$$

$$|A \otimes B| = |A \mathbin{\&} B| = |A| \wedge |B| \qquad |A \oplus B| = |A \parr B| = |A| \vee |B|$$

$$|\overset{+}{\neg}A| = |\overset{-}{\neg}A| = {\sim}|A| \qquad |{\downarrow}A| = |{\uparrow}A| = |A|$$

$$|A \to B| = {\sim}|A| \vee |B| \qquad |A - B| = |A| \wedge {\sim}|B|$$

from PPL formulas to formulas of propositional logic. A polarization of $b$ is a PPL formula $A$ such that $b$ and $|A|$ are classically equivalent. We extend the terminology pointwise to sequents of formulas.

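Example 3.4.2. Among the (infinitely many) polarizations of $X \wedge Y$ are:

$$X^+ \otimes Y^+ \qquad {\downarrow}X^- \otimes Y^+ \qquad X^- \mathbin{\&} {\uparrow}Y^+ \qquad {\downarrow}{\uparrow}X^+ \otimes Y^+ \qquad {\uparrow}(X^+ \otimes Y^+) \qquad \dots$$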
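The forgetful map is a straightforward structural recursion, and running it on a few polarizations shows that they collapse to the same propositional formula. The sketch below is illustrative only: the tuple encoding and the name `erase` are our own, and both negations and both shifts are handled uniformly on the assumption that the map ignores polarity.

```python
# Hypothetical encoding (not from the text): PPL formulas as nested tuples,
# e.g. ("tensor", A, B), ("with", A, B), ("down", A), ("atom", "X", "+").
# Propositional formulas are tuples ("var", x), ("T",), ("F",),
# ("and", a, b), ("or", a, b), ("not", a).

def erase(f):
    """Forgetful map |-| from a PPL formula to a propositional formula."""
    tag = f[0]
    if tag == "atom":                       # polarity annotation is dropped
        return ("var", f[1])
    if tag in ("one", "top"):
        return ("T",)
    if tag in ("zero", "bot"):
        return ("F",)
    if tag in ("tensor", "with"):           # both conjunctions become "and"
        return ("and", erase(f[1]), erase(f[2]))
    if tag in ("plus", "par"):              # both disjunctions become "or"
        return ("or", erase(f[1]), erase(f[2]))
    if tag in ("neg+", "neg-"):             # both negations become "not"
        return ("not", erase(f[1]))
    if tag in ("down", "up"):               # shifts are erased
        return erase(f[1])
    if tag == "imp":                        # A -> B  becomes  ~A \/ B
        return ("or", ("not", erase(f[1])), erase(f[2]))
    if tag == "sub":                        # A - B   becomes  A /\ ~B
        return ("and", erase(f[1]), ("not", erase(f[2])))
    raise ValueError(f"unknown connective: {tag!r}")


Xp, Xn = ("atom", "X", "+"), ("atom", "X", "-")
Yp = ("atom", "Y", "+")
polarizations = [
    ("tensor", Xp, Yp),
    ("tensor", ("down", Xn), Yp),
    ("with", Xn, ("up", Yp)),
    ("tensor", ("down", ("up", Xp)), Yp),
]
target = ("and", ("var", "X"), ("var", "Y"))
print(all(erase(p) == target for p in polarizations))
```

Here every listed polarization erases to the formula `X and Y` syntactically; in general the definition only asks for classical equivalence, which is weaker.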


Proposition 3.4.3. Let $a_1, \dots, a_m \vdash b_1, \dots, b_n$ be a classical sequent, and $\mathfrak{L} \vdash \mathfrak{R}$ be a polarization. Then $a_1, \dots, a_m \vdash b_1, \dots, b_n$ is classically true iff $\mathfrak{L} \vdash_{\mathcal{C}} \mathfrak{R}$.

Proof. Standard, reading the rules under the map $|-|$. □

Proposition  3.4.3  says  that  we  can  treat  a  classical  formula  and  its  polarization  into  PPL  as  inter¬ 
changeable  in  terms  of  provability.  In  terms  of  proof  search,  however,  we  can  view  polarization 
as  committing  to  a  particular  focusing  strategy  for  proving  the  classical  formula.  The  soundness 
and  completeness  theorems  will  tell  us  that  all  strategies  are  acceptable. 

3.4.2  Focusing  proofs  for  classical  sequent  calculus 

As in §3.2.2 and §3.2.3, we give two essentially equivalent formulations of focusing proofs: one in the style of Andreoli's $\Sigma_3$ (Figure 3.5) and the other in terms of patterns (Figure 3.6). The $\Sigma_3$-style calculus for PPL is almost identical to the one for MALL (recall Figure 3.2), except in the following ways:

1. In the rules for focusing on a particular formula (group "Neutral → Focus"), the focus formula is retained inside the stable context.

2. In the focusing rules for $\otimes$ and $\parr$, the entire context is passed to both premises, rather than being split nondeterministically.

⁴Since formulas must be well-polarized, we write these equivalences as a pair of equivalences, e.g., ${\downarrow}A \otimes {\downarrow}B \equiv {\downarrow}(A \mathbin{\&} B)$ and ${\uparrow}(A \otimes B) \equiv {\uparrow}A \mathbin{\&} {\uparrow}B$.
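Inversion phase
(as in Figure 3.2)

Inversion → Neutral
(as in Figure 3.2)

Neutral → Focus

$$\dfrac{A^+ \in R \quad L \vdash R; [A^+]}{L \vdash R} \qquad \dfrac{A^- \in L \quad [A^-]; L \vdash R}{L \vdash R}$$

Focus phase

$$\dfrac{}{L \vdash R; [1]} \qquad \dfrac{L \vdash R; [A] \quad L \vdash R; [B]}{L \vdash R; [A \otimes B]} \qquad \dfrac{[A]; L \vdash R}{[A \mathbin{\&} B]; L \vdash R} \qquad \dfrac{[B]; L \vdash R}{[A \mathbin{\&} B]; L \vdash R} \qquad \text{(no rule for } \top\text{)}$$

$$\text{(no rule for } 0\text{)} \qquad \dfrac{L \vdash R; [A]}{L \vdash R; [A \oplus B]} \qquad \dfrac{L \vdash R; [B]}{L \vdash R; [A \oplus B]} \qquad \dfrac{[A]; L \vdash R \quad [B]; L \vdash R}{[A \parr B]; L \vdash R} \qquad \dfrac{}{[\bot]; L \vdash R}$$

Focus → Inversion/Initial

$$\dfrac{X^+ \in L}{L \vdash R; [X^+]} \qquad \dfrac{L \vdash R; A^-}{L \vdash R; [{\downarrow}A]} \qquad \dfrac{A^+; L \vdash R}{[{\uparrow}A]; L \vdash R} \qquad \dfrac{X^- \in R}{[X^-]; L \vdash R}$$

Figure 3.5: A "$\Sigma_3$-style" focusing sequent calculus for PPL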


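$$\dfrac{L' \Vdash R'; [A^+] \quad \parr R'; L \vdash R; \otimes L'}{L \vdash R; [A^+]} \qquad \dfrac{L' \Vdash R'; [A^+] \;\longrightarrow\; L', L \vdash R, R'}{A^+; L \vdash R}$$

$$\dfrac{[A^-]; L' \Vdash R' \;\longrightarrow\; L', L \vdash R, R'}{L \vdash R; A^-} \qquad \dfrac{[A^-]; L' \Vdash R' \quad \parr R'; L \vdash R; \otimes L'}{[A^-]; L \vdash R}$$

$$\dfrac{\parr R_1'; L \vdash R; \otimes L_1' \quad \parr R_2'; L \vdash R; \otimes L_2'}{\parr R_1', R_2'; L \vdash R; \otimes L_1', L_2'} \qquad \dfrac{}{\parr \cdot; L \vdash R; \otimes \cdot}$$

$$\dfrac{}{\parr \cdot; X^+, L \vdash R; \otimes X^+} \qquad \dfrac{A^+; L \vdash R}{\parr A^+; L \vdash R; \otimes \cdot} \qquad \dfrac{L \vdash R; A^-}{\parr \cdot; L \vdash R; \otimes A^-} \qquad \dfrac{}{\parr X^-; L \vdash R, X^-; \otimes \cdot}$$

$$\dfrac{A^+ \in R \quad L \vdash R; [A^+]}{L \vdash R} \qquad \dfrac{A^- \in L \quad [A^-]; L \vdash R}{L \vdash R}$$

Figure 3.6: A pattern-based formulation of PPL focusing proofs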
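3. In the focusing rules for $1$ and $\bot$, and the atomic initial sequents, the rest of the context need not be empty.

Basically, we have made exactly those changes necessary to ensure the admissibility of weakening and contraction on formulas inside the stable context. Writing $\vdash_{[\mathcal{C}]}$ for provability in this calculus, and letting $a$ and $c$ range over "assumptions" ($A^-$ or $X^+$) and "conclusions" ($A^+$ or $X^-$):

Weakening (Left).

• If $L \vdash_{[\mathcal{C}]} R$ then $a, L \vdash_{[\mathcal{C}]} R$

• If $L \vdash_{[\mathcal{C}]} R; [A]$ then $a, L \vdash_{[\mathcal{C}]} R; [A]$

• If $[A]; L \vdash_{[\mathcal{C}]} R$ then $[A]; a, L \vdash_{[\mathcal{C}]} R$

• If $\mathfrak{L}; L \vdash_{[\mathcal{C}]} R; \mathfrak{R}$ then $\mathfrak{L}; a, L \vdash_{[\mathcal{C}]} R; \mathfrak{R}$

Contraction (Left).

• If $a, a, L \vdash_{[\mathcal{C}]} R$ then $a, L \vdash_{[\mathcal{C}]} R$

• If $a, a, L \vdash_{[\mathcal{C}]} R; [A]$ then $a, L \vdash_{[\mathcal{C}]} R; [A]$

• If $[A]; a, a, L \vdash_{[\mathcal{C}]} R$ then $[A]; a, L \vdash_{[\mathcal{C}]} R$

• If $\mathfrak{L}; a, a, L \vdash_{[\mathcal{C}]} R; \mathfrak{R}$ then $\mathfrak{L}; a, L \vdash_{[\mathcal{C}]} R; \mathfrak{R}$

Weakening (Right).

• If $L \vdash_{[\mathcal{C}]} R$ then $L \vdash_{[\mathcal{C}]} R, c$

• If $L \vdash_{[\mathcal{C}]} R; [A]$ then $L \vdash_{[\mathcal{C}]} R, c; [A]$

• If $[A]; L \vdash_{[\mathcal{C}]} R$ then $[A]; L \vdash_{[\mathcal{C}]} R, c$

• If $\mathfrak{L}; L \vdash_{[\mathcal{C}]} R; \mathfrak{R}$ then $\mathfrak{L}; L \vdash_{[\mathcal{C}]} R, c; \mathfrak{R}$

Contraction (Right).

• If $L \vdash_{[\mathcal{C}]} R, c, c$ then $L \vdash_{[\mathcal{C}]} R, c$

• If $L \vdash_{[\mathcal{C}]} R, c, c; [A]$ then $L \vdash_{[\mathcal{C}]} R, c; [A]$

• If $[A]; L \vdash_{[\mathcal{C}]} R, c, c$ then $[A]; L \vdash_{[\mathcal{C}]} R, c$

• If $\mathfrak{L}; L \vdash_{[\mathcal{C}]} R, c, c; \mathfrak{R}$ then $\mathfrak{L}; L \vdash_{[\mathcal{C}]} R, c; \mathfrak{R}$

Proof. Immediate by induction on derivations. □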

Since  we  included  the  structural  properties  explicitly  in  our  axiomatization  above,  and  we  treat 
them  implicitly  in  the  focusing  calculus,  soundness  is  not  quite  as  immediate  as  it  was  in  §3.2.2, 
but  it  is  nonetheless  easy  to  see. 

Theorem 3.4.4 (Soundness of $\Sigma_3$-style focusing for PPL).

• If $L \vdash_{[\mathcal{C}]} R$ then $L \vdash_{\mathcal{C}} R$

• If $L \vdash_{[\mathcal{C}]} R; [A]$ then $L \vdash_{\mathcal{C}} R, A$

• If $[A]; L \vdash_{[\mathcal{C}]} R$ then $A, L \vdash_{\mathcal{C}} R$

• If $\mathfrak{L}; L \vdash_{[\mathcal{C}]} R; \mathfrak{R}$ then $\mathfrak{L}, L \vdash_{\mathcal{C}} R, \mathfrak{R}$

Proof. If we erase structural punctuation, almost every rule becomes an instance of an ordinary sequent calculus rule or trivial. The focusing rules for $\otimes$ and $\parr$ are justified from $\otimes R$ and $\parr L$ by repeated use of CL and CR, and the focusing rules for $1$ and $\bot$ from $1R$ and $\bot L$ by repeated use of WL and WR. Likewise, the atomic initial sequents are justified by repeated use of WL and WR. Finally, the two rules in the group "Neutral → Focus" become instances of CR and CL, respectively. □
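Similarly, the pattern-based formulation of focusing for PPL is almost identical to the one for MALL, except that the rules for proving neutral sequents $L \vdash R$ and multiplicative sequents $\parr R'; L \vdash R; \otimes L'$ are modified to build in weakening and contraction. As before, writing $\vdash_{[\mathcal{C}\bullet]}$ for provability in the pattern-based formulation, we have:

Proposition 3.4.5 (Equivalence of focusing calculi).

1. $L \vdash_{[\mathcal{C}\bullet]} R; [A^+]$ iff $L \vdash_{[\mathcal{C}]} R; [A^+]$

2. $A^+; L \vdash_{[\mathcal{C}\bullet]} R$ iff $A^+; L \vdash_{[\mathcal{C}]} R$

3. $L \vdash_{[\mathcal{C}\bullet]} R; A^-$ iff $L \vdash_{[\mathcal{C}]} R; A^-$

4. $[A^-]; L \vdash_{[\mathcal{C}\bullet]} R$ iff $[A^-]; L \vdash_{[\mathcal{C}]} R$

5. $L \vdash_{[\mathcal{C}\bullet]} R$ iff $L \vdash_{[\mathcal{C}]} R$

Corollary 3.4.6 (Soundness of pattern-based formulation).

1. If $L \vdash_{[\mathcal{C}\bullet]} R; [A^+]$ then $L \vdash_{\mathcal{C}} R, A^+$

2. If $[A^-]; L \vdash_{[\mathcal{C}\bullet]} R$ then $A^-, L \vdash_{\mathcal{C}} R$

3. If $L \vdash_{[\mathcal{C}\bullet]} R; A^-$ then $L \vdash_{\mathcal{C}} R, A^-$

4. If $A^+; L \vdash_{[\mathcal{C}\bullet]} R$ then $A^+, L \vdash_{\mathcal{C}} R$

5. If $L \vdash_{[\mathcal{C}\bullet]} R$ then $L \vdash_{\mathcal{C}} R$

6. If $\parr R'; L \vdash_{[\mathcal{C}\bullet]} R; \otimes L'$ then $(\parr R'), L \vdash_{\mathcal{C}} R, (\otimes L')$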

3.4.3  Classical  focusing  proofs  are  unrestricted  canonical  derivations 

This  punchline  should  by  now  be  unsurprising:  the  pattern-based  formulation  of  focusing  for 
PPL is identical to our presentation of unrestricted canonical derivations with only simple hypotheses, under the same translation as in §3.2.4. As a corollary, we derive initial sequents and
cut  principles  for  focusing  proofs  by  translating  the  corresponding  identity  and  composition 
principles  for  unrestricted  canonical  derivations. 

Theorem 3.4.7 (Initial sequents). The following initial sequents are derivable in $\vdash_{[\mathcal{C}\bullet]}$:

1. If $A^+ \in R$ then $A^+; L \vdash R$

2. If $A^- \in L$ then $L \vdash R; A^-$

3. If $X^+ \in L$ then $X^+ \vdash \cdot; [X^+]$

4. If $X^- \in R$ then $[X^-]; \cdot \vdash X^-$

Theorem 3.4.8 (Cut-admissibility). The following cuts are admissible in $\vdash_{[\mathcal{C}\bullet]}$:

1. If $L \vdash R; [A^+]$ and $A^+; L \vdash R$ then $L \vdash R$

2. If $L \vdash R; A^-$ and $[A^-]; L \vdash R$ then $L \vdash R$

3. If $\parr R'; L \vdash R; \otimes L'$ and $L', L \vdash R, R'; [A^+]$ then $L \vdash R; [A^+]$

4. If $\parr R'; L \vdash R; \otimes L'$ and $[A^-]; L', L \vdash R, R'$ then $[A^-]; L \vdash R$

5. If $\parr R'; L \vdash R; \otimes L'$ and $L', L \vdash R, R'; A^-$ then $L \vdash R; A^-$

6. If $\parr R'; L \vdash R; \otimes L'$ and $A^+; L', L \vdash R, R'$ then $A^+; L \vdash R$

7. If $\parr R'; L \vdash R; \otimes L'$ and $\parr R''; L', L \vdash R, R'; \otimes L''$ then $\parr R''; L \vdash R; \otimes L''$

8. If $\parr R'; L \vdash R; \otimes L'$ and $L', L \vdash R, R'$ then $L \vdash R$

Proof.  These  are  all  instances  of  the  identity  and  composition  principles  for  unrestricted  canonical 
derivations,  under  the  translation  guide.  □ 

3.4.4  The  completeness  theorem 

Again  we  prove  the  completeness  of  focusing  by  way  of  weak  focalization,  with  the  proof  almost 
identical  to  the  one  in  §3.3. 

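Lemma 3.4.9 (Weak focalization). Let $(\mathfrak{L} \vdash \mathfrak{R}) \longleftrightarrow \Gamma$. Then $\mathfrak{L} \vdash_{\mathcal{C}} \mathfrak{R}$ implies there is an unrestricted canonical derivation of $\Gamma \vdash \#$.

Proof. The only difference with the proof of Lemma 3.3.1 is that sometimes we need to invoke the weakening and contraction principles. We illustrate with a single example, the $\otimes R$ case:

• Case $\otimes R$: The proof ends in

$$\dfrac{\mathfrak{L}_1 \vdash \mathfrak{R}_1, A \quad \mathfrak{L}_2 \vdash \mathfrak{R}_2, B}{\mathfrak{L}_1, \mathfrak{L}_2 \vdash \mathfrak{R}_1, \mathfrak{R}_2, A \otimes B}$$

where $(\mathfrak{L}_1 \vdash \mathfrak{R}_1) \longleftrightarrow \Gamma_1$ and $(\mathfrak{L}_2 \vdash \mathfrak{R}_2) \longleftrightarrow \Gamma_2$. By the induction hypothesis, there exist unrestricted canonical derivations (ucds) $\mathcal{C}_1 :: (\Gamma_1, \bullet A \vdash \#)$ and $\mathcal{C}_2 :: (\Gamma_2, \bullet B \vdash \#)$. Let $\Gamma = \Gamma_1, \Gamma_2, \bullet A \otimes B$. Note that for any $p_1 :: (\Delta_1 \Vdash A)$ and $p_2 :: (\Delta_2 \Vdash B)$, we can build the derivation $\mathcal{C}_{(p_1,p_2)} :: (\Gamma, \Delta_1, \Delta_2 \vdash \#)$ as follows:

$$\mathcal{C}_{(p_1,p_2)} \;=\; \dfrac{\bullet A \otimes B \in \Gamma \quad \dfrac{\dfrac{p_1 :: \Delta_1 \Vdash A \quad p_2 :: \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B} \quad \dfrac{}{\Gamma, \Delta_1, \Delta_2 \vdash \Delta_1, \Delta_2}\,id}{\Gamma, \Delta_1, \Delta_2 \vdash A \otimes B}}{\Gamma, \Delta_1, \Delta_2 \vdash \#}$$

Then we derive $\Gamma \vdash \#$ with

$$\dfrac{\dfrac{\mathcal{C}_1 :: \Gamma_1, \bullet A \vdash \#}{\Gamma, \bullet A \vdash \#}\,\ddagger \quad \dfrac{p_1 :: \Delta_1 \Vdash A \;\longrightarrow\; \dfrac{\dfrac{\mathcal{C}_2 :: \Gamma_2, \bullet B \vdash \#}{\Gamma, \Delta_1, \bullet B \vdash \#}\,\ddagger \quad \dfrac{p_2 :: \Delta_2 \Vdash B \;\longrightarrow\; \mathcal{C}_{(p_1,p_2)} :: \Gamma, \Delta_1, \Delta_2 \vdash \#}{\Gamma, \Delta_1 \vdash \bullet B}}{\Gamma, \Delta_1 \vdash \#}\,\dagger}{\Gamma \vdash \bullet A}}{\Gamma \vdash \#}\,\dagger$$

where ($\dagger$) indicates uses of substitution (right into left), and ($\ddagger$) indicates weakening. Again, note that our choice to substitute $\mathcal{C}_1$ and $\mathcal{C}_2$ in this order is arbitrary.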
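$$\dfrac{A \in \Gamma}{\Gamma \vdash A}\,hyp \qquad \dfrac{\Gamma, A \vdash \#}{\Gamma \vdash {\sim}A}\,{\sim}I \qquad \dfrac{\Gamma \vdash {\sim}A \quad \Gamma \vdash A}{\Gamma \vdash \#}\,{\sim}E$$

$$\dfrac{\Gamma \vdash A \quad \Gamma \vdash B}{\Gamma \vdash A \wedge B}\,{\wedge}I \qquad \dfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash A}\,{\wedge}E \qquad \dfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash B}\,{\wedge}E$$

$$\dfrac{\Gamma \vdash A}{\Gamma \vdash A \vee B}\,{\vee}I \qquad \dfrac{\Gamma \vdash B}{\Gamma \vdash A \vee B}\,{\vee}I \qquad \dfrac{\Gamma \vdash A \vee B \quad \Gamma, A \vdash C \quad \Gamma, B \vdash C}{\Gamma \vdash C}\,{\vee}E$$

$$\dfrac{}{\Gamma \vdash T}\,TI \qquad \dfrac{\Gamma \vdash F}{\Gamma \vdash C}\,FE$$

(Note: $\#$ is a distinguished atomic formula, different from $F$.)

Figure 3.7: Natural deduction for conjunction, disjunction, and minimal negation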

□ 

Corollary 3.4.10. Weak entailment for unrestricted canonical derivations coincides with classical entailment, i.e., $A^+ \leq^- B^+$ (or $A^- \leq^+ B^-$) iff $|A| \supset |B|$ is a classical theorem.

Proof. As in the proof of Corollary 3.3.3, then applying Proposition 3.4.3. □

Corollary 3.4.11. Let $b$ be a classical theorem:

1. For any positive polarization $A^+$ of $b$, there is an unrestricted canonical derivation of $\bullet A^+ \vdash \#$

2. For any negative polarization $A^-$ of $b$, there is an unrestricted canonical derivation of $\cdot \vdash A^-$

3.5  Relating  focusing  and  double-negation  translations 

Take  a  look  at  the  two  parts  of  Corollary  3.4.11,  remembering  how  we  glossed  the  different 
hypothetical  judgments  in  Chapter  2.  The  conclusion  of  (1)  we  read  aloud  as,  "If  A  is  refutable, 
then contradiction". The conclusion of (2) we read simply as "A is provable", but for a negative notion of proof-by-contradiction. It seems then that focusing proofs, via their one-to-one
correspondence  with  canonical  derivations,  give  us  different  double-negation  interpretations  of 
classical  propositions — and  the  completeness  of  focusing  corresponds  to  the  completeness  of 
these  interpretations. 

How does this relate to the traditional double-negation translations of classical into intuitionistic or minimal logic? To answer this question, we first relate canonical derivations to proofs in minimal logic. A standard natural deduction⁵ for minimal logic is given in Figure 3.7. Consider the following pair of maps $(-)^{m+}$ and $(-)^{m-}$ from polarized to unpolarized formulas:

⁵Note we could also consider a sequent calculus presentation, which would make some aspects of the correspondence with canonical derivations more direct. However, we are anticipating Chapter 4, where we will relate (the Curry-Howard interpretation of) canonical derivations with terms of λ-calculus.
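$$
\begin{array}{rcl@{\qquad}rcl}
X^{m+} &=& X & X^{m-} &=& {\sim}X \\
1^{m+} &=& T & \bot^{m-} &=& T \\
0^{m+} &=& F & \top^{m-} &=& F \\
(A \oplus B)^{m+} &=& A^{m+} \vee B^{m+} & (A \mathbin{\&} B)^{m-} &=& A^{m-} \vee B^{m-} \\
(A \otimes B)^{m+} &=& A^{m+} \wedge B^{m+} & (A \parr B)^{m-} &=& A^{m-} \wedge B^{m-} \\
(\overset{v}{\neg} A)^{m+} &=& {\sim}A^{m+} & (\overset{n}{\neg} A)^{m-} &=& {\sim}A^{m-} \\
({\downarrow} A)^{m+} &=& {\sim}A^{m-} & ({\uparrow} A)^{m-} &=& {\sim}A^{m+} \\
(A - B)^{m+} &=& A^{m+} \wedge B^{m-} & (A \to B)^{m-} &=& A^{m+} \wedge B^{m-}
\end{array}
$$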

Note that ${\sim}A$ stands for minimal negation, i.e., negation defined by ${\sim}A = A \supset \#$, where $\#$ is
a distinguished atomic formula (as opposed to the intuitionistic definition ${\sim}A = A \supset F$). If we
wanted we could be more explicit and write ${\sim}_{\#}A$, since the atom $\#$ is arbitrary.

We write $A^m$ for the translation of an arbitrary polarity formula ($A^{m+}$ when A is positive,
$A^{m-}$ when A is negative). Observe that for the purely positive fragment of PPL, $A^m$ is equal to
the forgetful translation $|A|$ we defined in Definition 3.4.1. For the negative fragment, $A^m$ is the
De Morgan dual of $|A|$. The translation is extended to assertions and refutations as follows:

$$(A^+)^m = A^{m+} \qquad (\bullet A^-)^m = A^{m-} \qquad (\bullet A^+)^m = {\sim}A^{m+} \qquad (A^-)^m = {\sim}A^{m-}$$

The contradiction judgment $\#$ is translated as the distinguished atom $\#$. Frames $\Delta$ are translated
on the right by treating them as big conjunctions:

$$(\cdot)^m = T \qquad (\Delta_1, \Delta_2)^m = \Delta_1^m \wedge \Delta_2^m$$

On  the  left,  appearing  in  contexts,  frames  are  translated  into  lists  of  assumptions: 

$$(\cdot)^m = \cdot \qquad (\Delta_1, \Delta_2)^m = (\Delta_1^m, \Delta_2^m)$$

so  that  polarized  contexts  can  be  translated  into  minimal  contexts: 

$$(\cdot)^m = \cdot \qquad (\Gamma, \Delta)^m = (\Gamma^m, \Delta^m)$$
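
The formula translation above can be sketched as a pair of mutually recursive functions. This is only an illustration under stated assumptions: the tuple encoding of formulas, the tag names (`"atom+"`, `"down"`, `"neg+"`, and so on) and the helper `show` are hypothetical choices made here, and the clauses for subtraction and implication are left out.

```python
# Minimal-logic formulas are encoded as tuples (an assumption of this sketch):
# ("var", X), ("T",), ("F",), ("and", a, b), ("or", a, b), ("not", a),
# where "not" stands for minimal negation.

def neg(a):
    return ("not", a)

def m_pos(f):
    """Translate a positive polarized formula, following the (-)^{m+} column."""
    tag = f[0]
    if tag == "atom+":
        return ("var", f[1])
    if tag == "one":
        return ("T",)
    if tag == "zero":
        return ("F",)
    if tag == "plus":
        return ("or", m_pos(f[1]), m_pos(f[2]))
    if tag == "tensor":
        return ("and", m_pos(f[1]), m_pos(f[2]))
    if tag == "neg+":            # positive negation of a positive formula
        return neg(m_pos(f[1]))
    if tag == "down":            # down-shift of a negative formula
        return neg(m_neg(f[1]))
    raise ValueError("not a positive formula: %r" % (f,))

def m_neg(f):
    """Translate a negative polarized formula, following the (-)^{m-} column."""
    tag = f[0]
    if tag == "atom-":
        return neg(("var", f[1]))
    if tag == "bot":
        return ("T",)
    if tag == "top":
        return ("F",)
    if tag == "with":
        return ("or", m_neg(f[1]), m_neg(f[2]))
    if tag == "par":
        return ("and", m_neg(f[1]), m_neg(f[2]))
    if tag == "neg-":            # negative negation of a negative formula
        return neg(m_neg(f[1]))
    if tag == "up":              # up-shift of a positive formula
        return neg(m_pos(f[1]))
    raise ValueError("not a negative formula: %r" % (f,))

def show(a):
    """Render a minimal-logic formula as a string."""
    tag = a[0]
    if tag == "var":
        return a[1]
    if tag in ("T", "F"):
        return tag
    if tag == "not":
        return "~" + show(a[1])
    op = " /\\ " if tag == "and" else " \\/ "
    return "(" + show(a[1]) + op + show(a[2]) + ")"

example = ("down", ("par", ("up", ("atom+", "X")), ("up", ("atom+", "Y"))))
translated = m_pos(example)
```

Running `show(translated)` on the example renders the shifted disjunction as a negated conjunction of negations, which is the shape that reappears in the Gödel-Gentzen clause for disjunction.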

Theorem 3.5.1 (Soundness of translation into minimal logic). Any unrestricted canonical derivation
of $\Gamma \vdash J$ can be transformed into a minimal logic proof of $\Gamma^m \vdash J^m$.

Proof. We first establish the following facts:

1. Any A-pattern with frame $\Delta$ can be transformed into a minimal logic proof of $\Delta^m \vdash A^m$.

2. Suppose that for every A-pattern, there is a minimal logic proof of $\Gamma^m, \Delta^m \vdash J^m$, where
$\Delta$ is the frame of the pattern. Then there is a minimal logic proof of $\Gamma^m, A^m \vdash J^m$.

These are both obvious by inspection of the pattern-formation rules. The theorem follows
immediately, by induction on canonical derivations. □

Theorem 3.5.2 (Completeness of translation into minimal logic). If $\Gamma^m \vdash J^m$ has a minimal logic
proof then there is an unrestricted canonical derivation of $\Gamma \vdash J$.
Proof. By induction on the minimal natural deduction proof, similar to the proof of weak focalization.
We give a few representative cases:

• Case ${\sim}I$: The proof ends in

$$\frac{\Gamma^m, A^m \vdash \#}{\Gamma^m \vdash {\sim}A^m}$$

Note that ${\sim}A^m$ could be the translation of many different judgments, e.g., $\bullet A^+$ or ${\downarrow}A$, or $A^-$ or $\bullet{\uparrow}A$, etc.
Without loss of generality we can assume the first case (since it either implies or is dual to the other cases).
By appealing to the i.h. we obtain a ucd of $\Gamma, A^+ \vdash \#$, which yields $\Gamma \vdash \bullet A$ by an admissible inference.

• Case $\vee E$: The proof ends in

$$\frac{\Gamma^m \vdash A^m \vee B^m \qquad \Gamma^m, A^m \vdash J^m \qquad \Gamma^m, B^m \vdash J^m}{\Gamma^m \vdash J^m}$$

Since $(A \oplus B)^m = A^m \vee B^m$, by (2) there is a ucd of $\Gamma \vdash A \oplus B$, and hence some $\Delta \Vdash A \oplus B$ such that $\Gamma \vdash \Delta$.
Since $\Delta \Vdash A \oplus B$ iff $\Delta \Vdash A$ or $\Delta \Vdash B$, we obtain the desired result by first applying pattern
substitution on one of the other two premises, and then applying composition.


□ 


Corollary 3.5.3. For purely positive A, strong entailment coincides with minimal entailment, i.e., $A \leq^+ B$
iff $|A| \supset |B|$ is a minimal theorem.

Corollary 3.5.4. For purely negative A, strong entailment coincides with minimal entailment, i.e., $A \leq^+ B$
iff $|A| \supset |B|$ is a minimal theorem.

Now, our general recipe for building sound and complete double-negation translations of classical
logic into minimal logic can be represented by a diagram:

                  polarized logic
            (−)*  ↗             ↘  (−)^m
      classical logic - - - - - → minimal logic

Starting from a classical theorem b, we translate it to a polarized formula $b^*$ and apply the
completeness of focusing to obtain an appropriate canonical derivation. Call this step $(-)^*$. In
the next step $(-)^m$, we apply soundness of the embedding into minimal logic. This establishes
that the translation $(-)^m \circ (-)^*$ is complete.⁶ Conversely, it is sound simply because minimal
logic is included in classical logic, and because $(-)^m \circ (-)^*$ carries b to a formula that is classically
equivalent.

By  defining  different  polarizations  of  the  classical  connectives,  we  can  reconstruct  different 
double-negation  translations.  For  example,  there  are  two  particularly  obvious  ways  to  polarize 
a  formula:  into  the  purely  positive  fragment  of  PPL,  or  into  the  purely  negative  fragment. 
The former gives us Glivenko's famous theorem, while the latter gives us a simple but lesser-known
double-negation translation due to Lafont, Reus, and Streicher [1993], recently used
by Streicher and Kohlenbach [2007] and (independently) by Avigad [2006] to connect Gödel's
Dialectica translation to the variant by Shoenfield.

⁶ It is important to distinguish the action of $(-)^*$ and $(-)^m$ on classical/PPL formulas from their action on classical
proofs and canonical derivations. By $(-)^m \circ (-)^*$, we mean the latter. In general, the translation $(-)^m \circ (-)^*$ will
carry b to a formula that may be different from $b^{*m}$ by one or two negations.

⁷ The paper by Lafont, Reus, and Streicher gives credit to Krivine [1990] and Girard [1991a] for the inspiration
for  the  translation,  although  the  connection  is  not  completely  obvious.  The  simple  version  of  the  translation  given 
below  (Theorem  3.5.6)  for  the  full  suite  of  propositional  connectives  only  appears  in  the  recent  papers  of  Streicher 
and  Kohlenbach  [2007]  and  Avigad  [2006]. 
Theorem 3.5.5 (Glivenko [1929]). b is a classical theorem iff ${\sim}{\sim}b$ is a minimal theorem.

Proof. Take $(-)^*$ to be the purely positive polarization:

$$X^* = X^+ \qquad T^* = 1 \qquad F^* = 0$$

$$(a \wedge b)^* = a^* \otimes b^* \qquad (a \vee b)^* = a^* \oplus b^* \qquad ({\sim}a)^* = \overset{v}{\neg} a^*$$

We can easily verify that $b^*$ is a polarization of b, and that $b^{*m+} = b$. Assume b is a classical
theorem. We apply Corollary 3.4.11(1) to obtain a ucd of $\bullet b^* \vdash \#$, and Theorem 3.5.1 to obtain
a minimal logic proof of ${\sim}b^{*m+} \vdash \#$. Hence ${\sim}{\sim}b$ is a minimal logic theorem. In the backward
direction, we use the inclusion of minimal logic into classical logic, and that ${\sim}{\sim}b$ and b are
classically equivalent. □

Theorem 3.5.6 (Streicher and Kohlenbach [2007], Avigad [2006]). Let $b'$ be defined as follows:

$$X' = {\sim}X \qquad T' = F \qquad F' = T$$

$$(a \wedge b)' = a' \vee b' \qquad (a \vee b)' = a' \wedge b' \qquad ({\sim}a)' = {\sim}a'$$

Then b is a classical theorem iff ${\sim}b'$ is a minimal theorem.

Proof. Take $(-)^*$ to be the purely negative polarization:

$$X^* = X^- \qquad T^* = \top \qquad F^* = \bot$$

$$(a \wedge b)^* = a^* \mathbin{\&} b^* \qquad (a \vee b)^* = a^* \parr b^* \qquad ({\sim}a)^* = \overset{n}{\neg} a^*$$

We can easily verify that $b^*$ is a polarization of b, and that $b^{*m-} = b'$. Assume b is a classical
theorem. We apply Corollary 3.4.11(2) to obtain a ucd of $\cdot \vdash b^*$, and Theorem 3.5.1 to obtain
a minimal logic proof of $\cdot \vdash {\sim}b^{*m-}$. Hence ${\sim}b'$ is a minimal logic theorem. In the backward
direction, we use the inclusion of minimal logic into classical logic, and that ${\sim}b'$ and b are
classically equivalent. □
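
The translation $b'$ of Theorem 3.5.6 is easy to compute, and the classical equivalence used in the backward direction can be checked on examples by brute force. The following is a sketch only: the tuple encoding and the function names are assumptions of this illustration, and the truth-table check reads `~` as ordinary classical negation (that is, it interprets the atom # as falsity).

```python
from itertools import product

def prime(b):
    """The b' translation: negate atoms, swap T/F and and/or, commute with ~."""
    tag = b[0]
    if tag == "var":
        return ("not", b)
    if tag == "T":
        return ("F",)
    if tag == "F":
        return ("T",)
    if tag == "and":
        return ("or", prime(b[1]), prime(b[2]))
    if tag == "or":
        return ("and", prime(b[1]), prime(b[2]))
    if tag == "not":
        return ("not", prime(b[1]))
    raise ValueError("unknown formula: %r" % (b,))

def atoms(b):
    """Collect the atom names occurring in a formula."""
    if b[0] == "var":
        return {b[1]}
    return set().union(*[atoms(c) for c in b[1:]])

def holds(b, env):
    """Classical truth value of b under an assignment env of atoms to booleans."""
    tag = b[0]
    if tag == "var":
        return env[b[1]]
    if tag == "T":
        return True
    if tag == "F":
        return False
    if tag == "and":
        return holds(b[1], env) and holds(b[2], env)
    if tag == "or":
        return holds(b[1], env) or holds(b[2], env)
    return not holds(b[1], env)      # tag == "not"

def classically_equivalent(a, b):
    """Compare truth tables over all assignments to the atoms of a and b."""
    names = sorted(atoms(a) | atoms(b))
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if holds(a, env) != holds(b, env):
            return False
    return True

X = ("var", "X")
excluded_middle = ("or", X, ("not", X))
translated = prime(excluded_middle)
```

For the excluded middle, `translated` is the conjunction of `~X` and `~~X`, and `classically_equivalent(("not", translated), excluded_middle)` confirms the equivalence on this instance. The check says nothing about minimal provability, which is what the theorem itself establishes.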

The well-known Gödel-Gentzen translation corresponds to a slightly more involved polarization.

Theorem 3.5.7 (Gödel [1932], Gentzen [1936]). Let $b^G$ be defined as follows:

$$X^G = {\sim}{\sim}X \qquad T^G = T \qquad F^G = {\sim}T$$

$$(A \wedge B)^G = A^G \wedge B^G \qquad (A \vee B)^G = {\sim}({\sim}A^G \wedge {\sim}B^G) \qquad ({\sim}A)^G = {\sim}A^G$$

Then b is a classical theorem iff $b^G$ is a minimal theorem.

Proof. Consider the following polarization:

$$X^* = {\downarrow}{\uparrow}X^+ \qquad T^* = 1 \qquad F^* = {\downarrow}\bot$$

$$(a \wedge b)^* = a^* \otimes b^* \qquad (a \vee b)^* = {\downarrow}({\uparrow}a^* \parr {\uparrow}b^*) \qquad ({\sim}a)^* = \overset{v}{\neg} a^*$$

We can easily verify $b^{*m} = b^G$. And again, we can immediately see that $b^*$ is indeed a polarization
of b, and that if b is a classical theorem then there is a ucd of $\bullet b^* \vdash \#$. But then the embedding
into minimal logic only lets us conclude ${\sim}{\sim}b^G$, rather than $b^G$. If we could strengthen $\bullet b^* \vdash \#$
to $\cdot \vdash b^*$, we would obtain our result. But what justifies that step?

Definition 3.5.8. We say that A is tagless if there is exactly one A-pattern, and if that pattern's frame
only has hypotheses of the form $\bullet B^+$ or $C^-$.

For example, ${\downarrow}A \otimes {\downarrow}B$ is tagless, but ${\downarrow}A \oplus {\downarrow}B$ and $X^+ \otimes Y^+$ are not.

Lemma 3.5.9. Let A be tagless. If there is a ucd of $\Gamma, \bullet A^+ \vdash \#$ (resp. $\Gamma, A^- \vdash \#$) then there is a ucd of
$\Gamma \vdash A^+$ (resp. $\Gamma \vdash \bullet A^-$).

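Proof. Assume A is positive (the negative case is dual). Let $\Delta$ be the frame of the unique A
proof-pattern. To derive $\Gamma \vdash A^+$, we must show $\Gamma \vdash \Delta$, which reduces to showing $\Gamma \vdash \bullet B^+$ and
$\Gamma \vdash C^-$ for every hypothesis $\bullet B^+ \in \Delta$ and $C^- \in \Delta$ (there are no other forms of hypothesis in
$\Delta$). Consider the case of a positive hypothesis $\bullet B^+ \in \Delta$ (the negative case is dual). To derive
$\Gamma \vdash \bullet B^+$, we must show that $\Gamma, \Delta' \vdash \#$ for every $\Delta' \Vdash B^+$. Now, by assumption we have that
$\Gamma, \bullet A^+ \vdash \#$, which we can weaken to $\Gamma, \Delta', \bullet A^+ \vdash \#$. So, it suffices to show that $\Gamma, \Delta' \vdash \bullet A^+$ and
apply composition. But since A is tagless, this reduces to showing $\Gamma, \Delta', \Delta \vdash \#$ for $\Delta$ as above.
We complete the derivation like so (where $p'$ is the pattern deriving $\Delta' \Vdash B^+$ and Id is the identity substitution):

$$\frac{\bullet B^+ \in \Delta \qquad \dfrac{p' :: (\Delta' \Vdash B^+) \qquad \mathrm{Id} :: (\Gamma, \Delta', \Delta \vdash \Delta')}{\Gamma, \Delta', \Delta \vdash B^+}}{\Gamma, \Delta', \Delta \vdash \#}$$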

□ 

Therefore we can finish the proof of completeness of the Gödel-Gentzen translation by observing
that  the  polarization  b*  defined  above  is  always  tagless.  As  usual,  soundness  of  the  translation 
is  obvious  because  bG  is  classically  equivalent  to  b.  □ 
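
The claim that the polarization used in this proof composes with the translation into minimal logic to give exactly the Gödel-Gentzen translation can be checked mechanically on sample formulas. The sketch below restates only the clauses it needs; the tuple encoding and the names `star`, `gg`, `m_pos` and `m_neg` are assumptions of this illustration, not notation from the text.

```python
def star(b):
    """The polarization from the proof of Theorem 3.5.7."""
    tag = b[0]
    if tag == "var":
        return ("down", ("up", ("atom+", b[1])))
    if tag == "T":
        return ("one",)
    if tag == "F":
        return ("down", ("bot",))
    if tag == "and":
        return ("tensor", star(b[1]), star(b[2]))
    if tag == "or":
        return ("down", ("par", ("up", star(b[1])), ("up", star(b[2]))))
    if tag == "not":
        return ("neg+", star(b[1]))
    raise ValueError("unknown formula: %r" % (b,))

def gg(b):
    """The Goedel-Gentzen translation, clause by clause."""
    tag = b[0]
    if tag == "var":
        return ("not", ("not", b))
    if tag == "T":
        return ("T",)
    if tag == "F":
        return ("not", ("T",))
    if tag == "and":
        return ("and", gg(b[1]), gg(b[2]))
    if tag == "or":
        return ("not", ("and", ("not", gg(b[1])), ("not", gg(b[2]))))
    if tag == "not":
        return ("not", gg(b[1]))
    raise ValueError("unknown formula: %r" % (b,))

def m_pos(f):
    """Positive half of the translation into minimal logic (needed clauses only)."""
    tag = f[0]
    if tag == "atom+":
        return ("var", f[1])
    if tag == "one":
        return ("T",)
    if tag == "tensor":
        return ("and", m_pos(f[1]), m_pos(f[2]))
    if tag == "neg+":
        return ("not", m_pos(f[1]))
    if tag == "down":
        return ("not", m_neg(f[1]))
    raise ValueError("unexpected positive formula: %r" % (f,))

def m_neg(f):
    """Negative half of the translation into minimal logic (needed clauses only)."""
    tag = f[0]
    if tag == "bot":
        return ("T",)
    if tag == "par":
        return ("and", m_neg(f[1]), m_neg(f[2]))
    if tag == "up":
        return ("not", m_pos(f[1]))
    raise ValueError("unexpected negative formula: %r" % (f,))

X, Y = ("var", "X"), ("var", "Y")
samples = [
    X,
    ("F",),
    ("or", X, ("not", X)),
    ("and", ("or", X, Y), ("not", ("and", ("T",), Y))),
]
agree = all(m_pos(star(b)) == gg(b) for b in samples)
```

Agreement on these samples is of course not a proof; the general statement follows by a routine induction on b, one clause of `star` at a time.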


3.6  Related  Work 

Most  of  the  results  of  this  section  about  focusing  proofs  have  appeared  in  some  form  or  another 
in  prior  work,  at  least  in  spirit — only  the  presentation  in  terms  of  canonical  derivations  is  new, 
and the connection to minimal logic perhaps made clearer. The original motivation for introducing
polarities [Girard, 1991a, 1993] was to better understand the classical double-negation
translations, and Danos, Joinet, and Schellinx [1997] explored this view in depth, defining a polarized
sequent calculus LK^, and studying the behavior of alternate polarizations of classical
proofs.  Laurent  [2002]  gave  a  similar  analysis  in  his  dissertation.  Our  approach  was  only  a  slight 
twist,  since  we  combine  polarization  with  focusing,  and  so  can  view  the  completeness  theorem 
for  focusing  as  implying  the  completeness  of  different  double-negation  translations,  by  way  of 
the  embedding  into  minimal  logic. 

Since  Andreoli's  original  work,  focusing  has  been  used  very  successfully  as  a  proof  search 
procedure  in  automated  theorem  provers,  both  for  linear  and  intuitionistic  logic  [Howe,  1998, 
Chaudhuri,  2006,  McLaughlin  and  Pfenning,  2008]. 
Chapter 4

Proofs  as  programs 


Purely  applicative  languages  are  often  said  to  be  based  on  a  logical  system  called 
the  lambda  calculus,  or  even  to  be  "syntactically  sugared"  versions  of  the  lambda 
calculus. . . However, as we will see, although an unsugared applicative language is
syntactically equivalent to the lambda calculus, there is a subtle semantic difference.
Essentially, the "real" lambda calculus implies a different "order of application"

(i.e.,  normal-order  evaluation)  than  most  applicative  programming  languages. 

— John  Reynolds  [1972] 

In  this  chapter  I  will  go  back  to  our  motivating  analogy,  and  explain  how  to  read  derivations 
in polarized logic as a programming language. As it turns out, the different forms of logical inference
correspond directly to syntactic categories already familiar from the theory of functional
programming  languages.  For  example,  a  proof  pattern  corresponds  to  what  we  ordinarily  think 
of  as  "pattern"  in  functional  programming,  i.e.,  a  tree  of  constructors,  with  variables  at  the  leaves. 
A  proof  of  a  positive  proposition  corresponds  to  a  value  under  eager  semantics,  decomposed  as 
a  pattern  and  a  substitution  for  the  pattern's  variables.  A  refutation  of  a  positive  proposition 
corresponds  to  a  call-by-value  continuation,  defined  by  a  map  from  patterns  to  expressions,  i.e., 
by  "pattern-matching".  The  negative  story  is  dual  and  somewhat  less  intuitive,  but  corresponds 
closely  with  (and  gives  a  more  refined  analysis  of)  call-by-name  and  "lazy"  evaluation. 

These  concepts  are  familiar  from  the  semantics  of  programming  languages,  but  by  applying 
the  Curry-Howard  correspondence  to  derivations  of  polarized  logic  we  reconstruct  them  as 
syntax.  Perhaps  the  overarching  lesson  to  draw  from  this  is  that  syntax  and  semantics  should 
not  always  be  treated  independently,  because  there  are  interactions  going  both  ways.  A  historical 
example  of  this  interplay  is  the  continuation-passing-style  transform,  a  syntactic  transformation 
which,  as  Reynolds  [1972]  and  Plotkin  [1975]  observed,  determines  an  evaluation  strategy  for  a 
program.  One  way  to  understand  the  results  of  this  section  is  that  the  language  we  extract  from 
polarized  logic  intrinsically  enforces  continuation-passing-style,  and  just  as  we  saw  that  different 
polarizations  correspond  to  different  double-negation  translations  of  classical  logic,  we  can  see 
the  polarity  of  a  type  as  corresponding  to  a  choice  of  different  (call-by-value  or  call-by-name) 
CPS translations of the λ-calculus. From the composition principles for canonical derivations (or
equivalently,  from  the  cut-admissibility  theorem  for  focusing  proofs),  we  extract  a  procedure  for 
evaluating  programs  that  is  entirely  deterministic — the  answer  to  the  question  of  eager  vs.  lazy 
evaluation  is  encoded  in  types,  rather  than  being  a  global  property  of  the  language. 
The first part of this chapter is just an exercise in transliteration, showing how to annotate
logical  derivations  with  a  syntax  of  terms,  and  then  expressing  the  identity  and  composition 
principles  on  this  syntax.  Although  we  are  merely  reformulating  the  definitions  and  results  of 
Chapter  2,  our  aim  here  is  to  build  up  a  programming  language  and  some  operational  intuitions 
behind  it.  We  also  define  a  simple  notion  of  syntactic  or  definitional  equality  on  these  terms,  and 
show  that  the  terms  corresponding  to  the  identity  and  composition  principles  deserve  the  name 
(i.e.,  they  satisfy  suitably  formulated  unit  and  associativity  properties). 

Once we have this core language, we can then go on to consider richer notions of computation.
In particular, we will consider extensions of the language with different kinds of effects.
To begin we include only two very simple effects, Ω (the diverging computation) and ✠ (the
aborting computation), which have analogues in Girard's ludics [2001]. Since closed programs
can now yield two different observable results, it makes sense to consider a notion of observational
equivalence. We state this in terms of an environment semantics for evaluating closed
programs, which logically corresponds to a procedure for eliminating multiple cuts embedded
within a proof, rather than just a single cut. We consider the relationship between definitional
equality and observational equivalence, proving the perhaps surprising theorem that in the presence
of the two effects Ω and ✠ plus a counter, any two syntactically distinct terms in the core
language can be observationally distinguished. This generalizes a result by Girard, who showed
that to distinguish two affine terms (more literally: ludics' "designs"), the effects ✠ and Ω are
enough, and it gives us some confidence that we really do have a canonical notion of syntax.

We  also  briefly  discuss  an  untyped  version  of  the  language,  and  its  relationship  to  the  typed 
version. As with the usual λ-calculus, we can think of the untyped language as really uni-typed
(or actually bi-typed, if we include both positive and negative polarities). In our case,
though,  the  correspondence  is  more  direct  than  usual,  because  whereas  the  usual  translation 
from  untyped  to  typed  syntax  involves  addition  of  coercions,  here  the  coercions  were  already 
present,  mediating  between  the  different  syntactic  categories. 

Finally,  we  conclude  the  chapter  by  making  precise  the  correspondence  between  polarization 
and continuation-passing style, showing how both call-by-value and call-by-name λ-calculus are
embedded within our language C, as a particular mode of programming.


4.1  Type-free  notations  for  typeful  derivations 

Before  beginning  we  should  say  just  a  few  more  words  about  what  we  are  hoping  to  accomplish. 
At  least  before  we  extend  the  language  with  effects,  there  is  a  technical  sense  in  which  we  won't 
do  anything  we  haven't  already  done  in  Chapter  2,  because  we  view  programs  as  literally  the 
same  thing  as  derivations  in  polarized  logic.  Reynolds  [2000]  calls  this  equation  of  logical 
derivations  with  programs  an  intrinsic  definition  of  a  typed  language,  because  it  only  assigns 
meaning  to  well-typed  programs.  This  is  also  sometimes  called  a  "Church"  view  of  typing, 
since it matches Alonzo Church's formulation of the simply-typed λ-calculus [1940]. On the
other  hand,  an  extrinsic  definition  begins  with  a  raw,  untyped  syntax,  and  then  defines  types  as 
properties  of  these  untyped  terms,  preserved  by  operations  such  as  reduction.  This  is  sometimes 
called  a  "Curry"  view  of  typing,  after  the  work  of  Haskell  Curry  [Curry  and  Feys,  1958]. 

In  programming  languages  theory,  there  is  often  a  bias  towards  the  extrinsic  interpretation, 
because  it  accords  with  the  intuition  that  programs  carry  a  computational  content  irrespective 
of  their  type.  In  fact,  a  crucial  property  of  strongly-typed  functional  programming  languages 
like ML and Haskell is that they can be given a type-free operational semantics, i.e., programs
can  be  evaluated  without  any  run-time  type  information.  But  this  is  not  inconsistent  with  an 
intrinsic  view  of  typing.  While  the  intrinsic  approach  equates  terms  with  derivations  involving 
types,  it  is  not  the  case  that  terms  have  to  mention  these  types.  So  our  first  goal  in  this  chapter 
is  to  develop  different  type-free  notations  for  typeful  derivations. 

Consider  an  analogy  from  linguistics.  Suppose  we  have  a  context-free  grammar  for  a  (fairly 
small)  subset  of  English: 

1 : S → NP VP

2 : VP → V NP

3 : NP → John

4 : NP → Mary

5 : V → loves

To  show  that  "John  loves  Mary"  is  a  well-formed  sentence,  we  can  exhibit  the  following  parse 
tree: 


          S
        /   \
      NP     VP
      |     /  \
    John   V    NP
           |    |
         loves Mary


The  parse  tree  doesn't  say  which  grammar  rules  are  being  applied  at  each  node,  but  we  can 
indicate  them  with  a  more  explicit  tree  (written  upside-down  for  convenience,  in  inference-rule 
notation,  and  where  (•)  represents  concatenation): 


                   loves        Mary
                  ------- (5)  ------ (4)
     John            V           NP
    ------ (3)    ------------------- (2)
      NP                  VP
    --------------------------------- (1)
                    S

But  now  the  terminals  and  non-terminals  annotating  the  nodes  of  the  tree  are  redundant,  because 
they  can  be  reconstructed  from  the  rules.  We  may  as  well  write  the  derivation  like  so: 


         5   4
         -----
     3     2
     -------
        1


or  linearly  as  the  expression  1(3,  2(5, 4)).  This  representation  is  type-free,  even  though  we  can 
mechanically  reconstruct  all  of  the  types  from  the  definition  of  the  grammar. 
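
To make the last remark concrete, here is a small sketch that takes a type-free derivation such as 1(3, 2(5, 4)) and mechanically reconstructs both its category and the sentence it derives. The encoding of terms as `(rule, [subderivations])` pairs and the names `RULES` and `reconstruct` are assumptions of this illustration.

```python
# The five grammar rules, indexed by their labels.
RULES = {
    1: ("S", ["NP", "VP"]),
    2: ("VP", ["V", "NP"]),
    3: ("NP", ["John"]),
    4: ("NP", ["Mary"]),
    5: ("V", ["loves"]),
}
NONTERMINALS = {"S", "NP", "VP", "V"}

def reconstruct(term):
    """Return (category, words) for a type-free derivation, or raise TypeError."""
    label, children = term
    category, rhs = RULES[label]
    remaining = list(children)
    words = []
    for symbol in rhs:
        if symbol in NONTERMINALS:
            # Each non-terminal on the right-hand side consumes one subderivation.
            if not remaining:
                raise TypeError("rule %d needs more subderivations" % label)
            child_category, child_words = reconstruct(remaining.pop(0))
            if child_category != symbol:
                raise TypeError("rule %d expects %s, got %s"
                                % (label, symbol, child_category))
            words.extend(child_words)
        else:
            # Terminals are supplied by the rule itself.
            words.append(symbol)
    if remaining:
        raise TypeError("rule %d has too many subderivations" % label)
    return category, words

# The derivation 1(3, 2(5, 4)) from the text.
john_loves_mary = (1, [(3, []), (2, [(5, []), (4, [])])])
category, words = reconstruct(john_loves_mary)
```

The term itself mentions no categories, yet `reconstruct` recovers `S` and the words of the sentence, and it rejects ill-formed terms such as 1(5, 2(5, 4)).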

Although  our  canonical  derivations  have  more  complex  structure  than  parse  trees,  we  will 
follow  more  or  less  the  same  procedure  to  come  up  with  a  type-free  notation.  In  the  next  sections 
we  will  explain  how  to  build  a  type-free  notation  for  arbitrary  derivations,  while  still  taking  this 
as  an  intrinsic  definition  of  a  programming  language.  We  will  then  show  how  to  express  the 
composition  principles  for  canonical  derivations  directly  on  this  type-free  notation,  validating  the 
intuition that the run-time execution of programs does not require type information, although
we  can  still  use  types  to  reason  about  this  execution.  (This  will  not  be  the  end  of  our  take  on 
the  Curry  vs.  Church  debate,  however.  In  Chapter  6,  we  will  explain  how  to  define  extrinsic 
refinement  types  that  further  classify  intrinsically  well-typed  programs.) 


4.2  C+ :  A  call-by-value  language 

Just as we started Chapter 2 by describing a proof-biased approach to logic (and later explained
how to interpret it as a fragment of a larger, polarized logic), we will start this chapter by describing
a "value-biased", or call-by-value language C+, and then show how to view it as a fragment
of a larger language C that freely mixes call-by-value and call-by-name evaluation.

4.2.1  Continuation  frames  and  value  patterns 

We  begin  by  recalling  some  of  the  definitions  of  §2.1.1,  and  recasting  them  in  Curry-Howard 
terms.  Rather  than  speaking  of  positive  propositions,  we  will  now  speak  of  positive  types 
A+,B+,C+.  As  in  previous  chapters,  we  only  write  the  polarity  marker  (— )+  for  emphasis, 
and sometimes omit it to relax the notation. Recall that in this fragment, frames $\Delta$ consist of
lists of refutation holes $\bullet A_1^+, \ldots, \bullet A_n^+$. We now call these continuation holes.

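Likewise, whereas we previously referred to derivations of $\Delta \Vdash A^+$ as proof patterns, we now
call them value patterns. Recall the rules for deriving $\Delta \Vdash A^+$:

$$\frac{}{\bullet A^+ \Vdash \neg A}$$

$$\frac{}{\cdot \Vdash 1} \qquad \frac{\Delta_1 \Vdash A \quad \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B}$$

$$\frac{\Delta \Vdash A}{\Delta \Vdash A \oplus B} \qquad \frac{\Delta \Vdash B}{\Delta \Vdash A \oplus B} \qquad \text{(no rule for 0)}$$

Now we assign labels to the rules:

$$\frac{}{\bullet A^+ \Vdash \neg A}\ \_$$

$$\frac{}{\cdot \Vdash 1}\ () \qquad \frac{\Delta_1 \Vdash A \quad \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B}\ \mathsf{pair}$$

$$\frac{\Delta \Vdash A}{\Delta \Vdash A \oplus B}\ \mathsf{inl} \qquad \frac{\Delta \Vdash B}{\Delta \Vdash A \oplus B}\ \mathsf{inr}$$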

These labels give us a type-free notation for patterns. For example, pair(_, inl _) and pair(_, inr _)
are two $\neg A \otimes (\neg B \oplus \neg C)$-patterns with frames $\bullet A, \bullet B$ and $\bullet A, \bullet C$, respectively. Since
pair-patterns are used pretty frequently, we write $(p_1, p_2)$ as shorthand for pair($p_1$, $p_2$). We also
implicitly associate unary pattern constructors to the right, so for example we can write inl inl _
as shorthand for inl (inl _).
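
The pattern rules can be read as a recipe for enumerating every pattern of a positive type together with its frame. The sketch below does this for the connectives introduced so far; the tuple encoding of types and the function name `patterns` are assumptions of this illustration, and the type under a negation is kept as an opaque name.

```python
def patterns(ty):
    """Return every (pattern, frame) pair for a positive type.

    Types are tuples: ("neg", name), ("one",), ("zero",),
    ("tensor", A, B), ("plus", A, B). Frames are lists of holes.
    """
    tag = ty[0]
    if tag == "neg":
        # The only pattern is the hole itself, with a one-element frame.
        return [("_", ["•" + ty[1]])]
    if tag == "one":
        return [("()", [])]
    if tag == "zero":
        return []                      # no rule for 0
    if tag == "tensor":
        # Pair every left pattern with every right pattern; frames concatenate.
        return [("pair(%s, %s)" % (p1, p2), f1 + f2)
                for p1, f1 in patterns(ty[1])
                for p2, f2 in patterns(ty[2])]
    if tag == "plus":
        return ([("inl " + p, f) for p, f in patterns(ty[1])] +
                [("inr " + p, f) for p, f in patterns(ty[2])])
    raise ValueError("unknown type: %r" % (ty,))

example_type = ("tensor", ("neg", "A"), ("plus", ("neg", "B"), ("neg", "C")))
example_patterns = patterns(example_type)
```

For the example type this yields exactly the two patterns named in the text, with frames `•A, •B` and `•A, •C`. A type built with `("zero",)` on one side of a tensor has no patterns at all.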

As  in  Chapter  2,  we  will  use  patterns  in  a  fairly  open-ended  way,  without  relying  on  the 
properties of particular connectives except to construct particular examples. Here we can introduce
some additional types that didn't really have interesting logical counterparts in Chapter 2.
For  example,  we  define  patterns  for  booleans  and  natural  numbers: 
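$$\frac{}{\cdot \Vdash B}\ \mathsf{tt} \qquad \frac{}{\cdot \Vdash B}\ \mathsf{ff} \qquad \frac{}{\cdot \Vdash N}\ \mathsf{z} \qquad \frac{\Delta \Vdash N}{\Delta \Vdash N}\ \mathsf{s}$$

as well as patterns for a paradoxical type D:

$$\frac{\Delta \Vdash N}{\Delta \Vdash D}\ \mathsf{dn} \qquad \frac{\Delta \Vdash \neg D}{\Delta \Vdash D}\ \mathsf{dk}$$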

The type D is meant to include both natural numbers and D-continuations. Note that this is
a valid inductive definition of D-patterns, but that D itself is not well-founded in the sense of
the definition ordering (§2.1.2), since $D \prec D$. Later, we will use this type to construct
non-terminating programs.

Finally, let's observe that value patterns really do correspond to patterns in the usual functional
programming sense, with the proviso that the latter go "as deep as possible", i.e., up to
continuations. When we want to emphasize this proviso, we say that value patterns are maximal.
One consequence of maximality, e.g., is that N-patterns are in one-to-one correspondence
with numerals.

Notation. We write n as syntactic sugar for the N-pattern corresponding to the nth numeral, e.g., 0 = z,
1 = s z, 2 = s s z, etc. Note that every N-pattern has an empty frame, i.e., has no holes for continuations.


4.2.2  Annotated  frames,  contexts  and  binding 

Of course the reader may object, "But patterns in functional programming languages bind variables!"
Well, patterns in C+ do as well, but we have to think a bit carefully about what variables
are. Recall, we gave simple inductive rules for deciding containment $\Delta_1 \in \Delta_2$:

$$\frac{}{\Delta \in \Delta} \qquad \frac{\Delta \in \Delta_1}{\Delta \in \Delta_1, \Delta_2} \qquad \frac{\Delta \in \Delta_2}{\Delta \in \Delta_1, \Delta_2}$$

Now, suppose we assign labels to these rules:

$$\frac{}{\Delta \in \Delta}\ \mathsf{here} \qquad \frac{\Delta \in \Delta_1}{\Delta \in \Delta_1, \Delta_2}\ \mathsf{left} \qquad \frac{\Delta \in \Delta_2}{\Delta \in \Delta_1, \Delta_2}\ \mathsf{right}$$

Then a derivation of a containment relationship $\Delta_1 \in \Delta_2$ is a sequence of these constructors
defining a particular path through $\Delta_2$, and can be seen as a "de Bruijn index" for a variable
[de Bruijn, 1972]. The way we associate $\Delta_2$ matters for how we build this index. For example,
the derivation of $\bullet B \in (\bullet A, (\bullet B, \bullet C))$, the index of a continuation variable, would be annotated
right left here.
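
The labelled containment rules amount to a search for a path through a binary tree. The sketch below computes such a path and follows it back; representing a frame as a hole (a string) or a two-element tuple, and the names `index_of` and `follow`, are assumptions of this illustration.

```python
def index_of(target, frame):
    """Return a path of 'left'/'right' steps ending in 'here', or None.

    The first occurrence found in left-to-right order is returned.
    """
    if frame == target:
        return ["here"]
    if isinstance(frame, tuple):
        left, right = frame
        path = index_of(target, left)
        if path is not None:
            return ["left"] + path
        path = index_of(target, right)
        if path is not None:
            return ["right"] + path
    return None

def follow(path, frame):
    """Walk a path produced by index_of and return the subframe it denotes."""
    for step in path:
        if step == "left":
            frame = frame[0]
        elif step == "right":
            frame = frame[1]
    return frame

right_nested = ("•A", ("•B", "•C"))
left_nested = (("•A", "•B"), "•C")
path_right = index_of("•B", right_nested)
path_left = index_of("•B", left_nested)
```

With the right-nested frame the path is right, left, here, as in the text; re-associating the same three holes to the left changes the index to left, right, here, which is the sensitivity to associativity that motivates named variables.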

However, programming in this style becomes tedious very quickly. We would rather have actual
names to refer to the continuation holes inside of frames, and not care about the associativity
of  frames.  So,  let  us  introduce  some  new  notation. 

Definition 4.2.1 (Annotated frames). An annotated frame is a frame with labelled leaves, subject to
the usual variable conventions: we assume all of the labels within an annotated frame are distinct, and
we can freely α-convert labels. In particular, we write an annotated frame of continuation variables as
$\kappa_1 : \bullet A_1, \ldots, \kappa_n : \bullet A_n$, where the $\kappa_i$ are distinct.

Now,  we  can  give  an  alternative  notation  for  patterns,  using  annotated  frames.  In  particular,  we 
annotate  the  negation  pattern-rule  with  a  continuation  variable: 
$$\frac{}{\kappa : \bullet A^+ \Vdash \neg A}\ \kappa$$

The  other  rules  remain  the  same,  but  we  can  reinterpret  them  as  operating  on  annotated  frames. 
For  example,  consider  the  pair  pattern-rule: 

$$\frac{\Delta_1 \Vdash A \quad \Delta_2 \Vdash B}{\Delta_1, \Delta_2 \Vdash A \otimes B}\ \mathsf{pair}$$

Because  of  the  requirement  that  the  labels  in  an  annotated  frame  are  distinct,  the  rule  now 
has  an  implicit  side-condition,  that  the  two  subpatterns  do  not  repeat  variable  names.  (Note 
that  this  corresponds  to  the  usual  linearity  restriction  on  pattern-matching  in  most  functional 
programming  languages.) 

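To give an example, the two $\neg A \otimes (\neg B \oplus \neg C)$-patterns we wrote above could be annotated
with explicit variable names as pair($\kappa_1$, inl $\kappa_2$) and pair($\kappa_1$, inr $\kappa_2$). Their annotated frames are
$\kappa_1 : \bullet A, \kappa_2 : \bullet B$ and $\kappa_1 : \bullet A, \kappa_2 : \bullet C$.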
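
The annotated reading of the pattern rules, including the implicit linearity side-condition on pairs, can be sketched as a function that checks a named pattern against a type and returns its annotated frame. The tuple encodings and the name `annotated_frame` are assumptions of this illustration.

```python
def annotated_frame(pattern, ty):
    """Return the annotated frame of a pattern as a list of (variable, hole).

    Patterns: ("var", k), ("unit",), ("pair", p1, p2), ("inl", p), ("inr", p).
    Types:    ("neg", name), ("one",), ("tensor", A, B), ("plus", A, B).
    Raises TypeError on a shape mismatch, ValueError on a repeated variable.
    """
    p_tag, t_tag = pattern[0], ty[0]
    if p_tag == "var" and t_tag == "neg":
        return [(pattern[1], "•" + ty[1])]
    if p_tag == "unit" and t_tag == "one":
        return []
    if p_tag == "pair" and t_tag == "tensor":
        left = annotated_frame(pattern[1], ty[1])
        right = annotated_frame(pattern[2], ty[2])
        # Side-condition of the pair rule: the subpatterns share no names.
        repeated = {k for k, _ in left} & {k for k, _ in right}
        if repeated:
            raise ValueError("repeated variables: %s" % sorted(repeated))
        return left + right
    if p_tag == "inl" and t_tag == "plus":
        return annotated_frame(pattern[1], ty[1])
    if p_tag == "inr" and t_tag == "plus":
        return annotated_frame(pattern[1], ty[2])
    raise TypeError("pattern %r does not match type %r" % (pattern, ty))

ty = ("tensor", ("neg", "A"), ("plus", ("neg", "B"), ("neg", "C")))
frame_inl = annotated_frame(("pair", ("var", "k1"), ("inl", ("var", "k2"))), ty)
frame_inr = annotated_frame(("pair", ("var", "k1"), ("inr", ("var", "k2"))), ty)
```

The two results are the annotated frames given above; a pattern such as pair(k, inl k) is rejected because it uses the same name twice.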

When  defining  the  terms  of  C+  below,  we  will  work  with  both  annotated  and  unannotated 
frames.  However,  a  context  is  always  a  list  of  annotated  frames.  That  is,  a  term  inside  a  context 
can  mention  variables,  which  are  bound  by  the  context.  (This  is  one  reason  why  it  is  useful  to 
conceptually  distinguish  frames  and  contexts.) 

4.2.3  Values,  continuations,  substitutions,  expressions 

With  a  definition  of  patterns  (and  frames)  in  hand,  we  can  now  "make  a  language  for  ourselves". 
Recall  the  four  hypothetical  judgments: 

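$$\Gamma \vdash A^+ \qquad \Gamma \vdash \bullet A^+$$

$$\Gamma \vdash \Delta \qquad \Gamma \vdash \#$$

and the rules for building canonical derivations of these judgments within the positive, atom-free,
simple fragment of PPL:

$$\frac{\Delta \Vdash A^+ \quad \Gamma \vdash \Delta}{\Gamma \vdash A^+} \qquad \frac{\Delta \Vdash A^+ \longrightarrow \Gamma, \Delta \vdash \#}{\Gamma \vdash \bullet A^+}$$

$$\frac{}{\Gamma \vdash \cdot} \qquad \frac{\Gamma \vdash \Delta_1 \quad \Gamma \vdash \Delta_2}{\Gamma \vdash (\Delta_1, \Delta_2)} \qquad \frac{\bullet A^+ \in \Gamma \quad \Gamma \vdash A^+}{\Gamma \vdash \#}$$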

We  now  interpret  these  derivations  as  terms  of  a  programming  language. 

Definition 4.2.2 (Terms). A derivation $t :: (\Gamma \vdash J)$ is called a term. Terms may be further classified as
follows:

• A value is a derivation $V :: (\Gamma \vdash A^+)$

• A continuation is a derivation $K :: (\Gamma \vdash \bullet A^+)$

• A substitution is a derivation $\sigma :: (\Gamma \vdash \Delta)$

• An expression is a derivation $E :: (\Gamma \vdash \#)$
Sometimes we say "A-value", "A-continuation", etc., for added specificity. Again, the interesting
part is the type-free notation for terms, and the operational intuition that goes along with it. We
examine the rules for each judgment in turn:

• $A^+$: This judgment has a single introduction rule:

$$\frac{p :: (\Delta \Vdash A^+) \qquad \sigma :: (\Gamma \vdash \Delta)}{\Gamma \vdash A^+}$$

The rule combines a value pattern p and a substitution σ for the frame of the pattern, and
we write the result as p[σ]. If we think in terms of annotated frames, the intuition behind
this rule is expressed by the slogan,

a  value  is  a  pattern  under  a  substitution 

This  holds  intuitively  in  call-by-value  languages  like  ML,  by  a  fairly  trivial  "factorization 
lemma".  For  example,  the  ML  value 

(fn  x  =>  x*x,  fn  x  =>  x-3) 

can  be  factored  as  the  pattern 

(f, g)

composed  with  the  substitution 

let  val  f  =  fn  x  =>  x*x 
val  g  =  fn  x  =>  x-3 
in 

On  the  other  hand,  here  we  really  only  care  about  the  structure  of  the  pattern,  and  not  the 
variable  names,  so  we  should  read  the  rule  as  taking  an  unannotated  frame.  Either  way, 
the  utility  of  this  factorization  is  that  values  are  given  a  uniform  representation,  which  we 
will exploit to great effect when defining the operational semantics of C+.

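• $\Gamma \vdash \bullet A^+$: This judgment has a single introduction rule:

\[
\frac{\overset{p}{\Delta \Vdash A^+} \;\longrightarrow\; \overset{E_p}{\Gamma, \Delta \vdash \#}}{\Gamma \vdash \bullet A^+}
\]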


Encoded  in  the  rule  is  the  slogan  that 

a  continuation  is  a  map  from  patterns  to  expressions 

This  slogan  has  some  deep  syntactic  and  semantic  ramifications,  so  let's  explore  them 
carefully. First, the rule gives us license to define continuations by pattern-matching, just as we do in languages like ML. For example, we build an $N$-continuation $K$ by listing the cases

\[
\begin{aligned}
K\ \mathsf{z} &= E_0 \\
K\ \mathsf{s}\,\mathsf{z} &= E_1 \\
K\ \mathsf{s}\,\mathsf{s}\,\mathsf{z} &= E_2 \\
&\ \,\vdots
\end{aligned}
\]

which we might also express more concisely as

\[
K\ n = E_n
\]

A $D$-continuation might be defined by the map

\[
\begin{aligned}
K\ (\mathsf{dn}\ n) &= E_n \\
K\ (\mathsf{dk}\ \kappa) &= E_\kappa
\end{aligned}
\]

where $E_\kappa$ can use the bound continuation variable $\kappa$, relying on the annotated view of frames.

The fact that the rule quantifies over all $A$-patterns builds in the typical side-condition
(usually  checked  but  not  always  enforced)  that  pattern-matching  is  exhaustive.  It  is  worth 
noting  that  although  the  practical  benefits  of  pattern-matching  notation  are  frequently 
recognized,  it  is  often  seen  as  a  matter  of  surface  syntax,  as  "syntactic  sugar"  for  more 
verbose  elimination  constructs  that  explain  what  is  really  going  on  at  a  deeper  level.  We 
are  taking  the  opposite  view  by  treating  pattern-matching  as  a  primitive  notion  in  syntax. 

Second, the fact that continuations are defined by maximal pattern-matching has the consequence that they are strict. For example, to define a $D$-continuation, we define its behavior on any value matching one of the patterns $\mathsf{dn}\,\mathsf{z}$, $\mathsf{dn}\,\mathsf{s}\,\mathsf{z}$, $\mathsf{dn}\,\mathsf{s}\,\mathsf{s}\,\mathsf{z}$, etc., or $\mathsf{dk}\ \kappa$. In particular, we leave undefined its behavior on a diverging computation of a $D$, or on a computation that gives $\mathsf{dn}$ as an outermost constructor but then diverges while computing an $N$.

Finally,  the  slogan  conveys  that  a  continuation  is  nothing  more  than  a  map  from  patterns 
to expressions. Starting from this definition, we cannot help but treat continuations extensionally, by their behavior on (maximal) value patterns. We do not care about the details of how these maps are defined. As we will see, this extensional view dramatically simplifies the presentation of the operational semantics of our programming language relative to
typical  presentations  for  comparable  languages,  and  also  forces  our  hand  in  defining  the 
right  notion  of  equality  on  continuations. 

• $\Gamma \vdash \Delta$: We build a substitution simply by concatenating other substitutions (which ultimately are composed of continuations):

\[
\frac{}{\Gamma \vdash \cdot}
\qquad\qquad
\frac{\overset{\sigma_1}{\Gamma \vdash \Delta_1} \qquad \overset{\sigma_2}{\Gamma \vdash \Delta_2}}{\Gamma \vdash (\Delta_1, \Delta_2)}
\]

Because we can treat frames modulo associativity and unit (relying on the fact that the leaves of the frame are referenced by labels rather than paths), we can correspondingly think of these substitution constructors as monoidal operations, i.e., the concatenation of two substitutions $(\sigma_1, \sigma_2)$ is an associative operator, and the empty substitution $\cdot$ is its unit. So, any substitution can simply be written as a list of continuations, $\sigma = (K_1, \ldots, K_n)$. This is for the unannotated view of frames. The annotated view associates each continuation with a continuation variable, $\sigma = (K_1/\kappa_1, \ldots, K_n/\kappa_n)$. But these two views are freely interchangeable, by the following easy observation:

Observation 4.2.3. Any substitution $(K_1, \ldots, K_n) :: (\Gamma \vdash (\bullet A_1, \ldots, \bullet A_n))$ determines a map from continuation variables $\kappa : \bullet A \in (\kappa_1 : \bullet A_1, \ldots, \kappa_n : \bullet A_n)$ to continuations $K :: (\Gamma \vdash \bullet A)$, and conversely, given such a map we can build a substitution. We write $\sigma(\kappa)$ for this action of a $\Delta$-substitution on a continuation variable in (an annotation of) $\Delta$.

• $\Gamma \vdash \#$: There is a single rule for establishing contradiction in the positive fragment of PPL:

\[
\frac{\kappa : \bullet A^+ \in \Gamma \qquad \overset{V}{\Gamma \vdash A^+}}{\Gamma \vdash \#}
\]

This rule builds an expression $\kappa\ V$ by pairing an $A$-value with an $A$-continuation variable. Intuitively, in terms of conventional operational semantics, the expression $\kappa\ V$ is interpreted as passing the value $V$ to the continuation denoted by $\kappa$ at runtime. This is what Reynolds [1972] calls a "serious expression", because the resulting computation might do "serious" things (for example, diverge), according to $\kappa$'s whim. Of course, in our language this is as yet only a metaphor for what $\kappa$ could do in potential: we have not yet explained how to write programs that diverge, or that do anything interesting for that matter.
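
The slogan "a value is a pattern under a substitution" from the first rule above can be made concrete. The following Python sketch is only an illustration under assumptions that are not part of the text: the names `Hole`, `factor` and `plug` are hypothetical, nested tuples stand in for pattern constructors, and Python callables stand in for the higher-order components. It factors the ML pair discussed above into a pattern with two holes and a substitution for those holes.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Hole:
    """A labelled hole in a pattern (hypothetical stand-in for a variable)."""
    name: str


def factor(value):
    """Split a nested tuple into (pattern, substitution).

    Each callable leaf becomes a fresh hole; the substitution records
    which callable fills which hole.
    """
    fresh = count()
    subst = {}

    def go(v):
        if callable(v):
            hole = Hole(f"k{next(fresh)}")
            subst[hole.name] = v
            return hole
        if isinstance(v, tuple):
            return tuple(go(x) for x in v)
        return v  # first-order data is already pure pattern structure

    return go(value), subst


def plug(pattern, subst):
    """Inverse of factor: fill every hole of the pattern from the substitution."""
    if isinstance(pattern, Hole):
        return subst[pattern.name]
    if isinstance(pattern, tuple):
        return tuple(plug(p, subst) for p in pattern)
    return pattern


def square(x):
    return x * x


def minus3(x):
    return x - 3


# The ML value (fn x => x*x, fn x => x-3) factors as the pattern (f, g)
# together with a substitution for f and g.
pattern, subst = factor((square, minus3))
```

Only the shape of `pattern` matters, not the generated hole names, which corresponds to reading the rule with an unannotated frame.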


Before  discussing  the  properties  of  C+  in  detail,  let  us  make  a  few  easy  observations,  and  examine 
a  few  simple  programs. 

Proposition 4.2.4 (Weakening). If $t :: (\Gamma \vdash J)$ then $t :: (\Gamma(\Delta) \vdash J)$.

Proof.  Trivial,  by  observing  that  our  type-free  notation  is  unaffected  by  the  addition  of  extra 
frames  into  the  context.  Note,  though,  that  we  are  relying  on  the  fact  that  we  have  actual 
variables — with  a  de  Bruijn  approach,  weakening  requires  index  "shifting".  □ 


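Proposition 4.2.5. The value-constructing rules

\[
\frac{\Gamma \vdash A}{\Gamma \vdash A \oplus B}\ \text{INL}
\qquad\qquad
\frac{\Gamma \vdash B}{\Gamma \vdash A \oplus B}\ \text{INR}
\qquad\qquad
\frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \otimes B}\ \text{PAIR}
\]

are admissible, defined by the transformations

\[
\begin{aligned}
\text{INL}(p[\sigma]) &= (\mathsf{inl}\ p)[\sigma] \\
\text{INR}(p[\sigma]) &= (\mathsf{inr}\ p)[\sigma] \\
\text{PAIR}(p_1[\sigma_1], p_2[\sigma_2]) &= (p_1, p_2)[(\sigma_1, \sigma_2)]
\end{aligned}
\]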


Proposition 4.2.6. More generally, we can view standard value-constructors as syntactic sugar for combinators which operate on patterns and substitutions. Let $c$ be an $n$-ary pattern rule:

\[
\frac{\Delta_1 \Vdash A_1 \qquad \cdots \qquad \Delta_n \Vdash A_n}{\Delta_1, \ldots, \Delta_n \Vdash B}\ c
\]

Then the value-constructing rule

\[
\frac{\Gamma \vdash A_1 \qquad \cdots \qquad \Gamma \vdash A_n}{\Gamma \vdash B}\ C
\]

is admissible, defined by the operation $C(p_1[\sigma_1], \ldots, p_n[\sigma_n]) = c(p_1, \ldots, p_n)[(\sigma_1, \ldots, \sigma_n)]$.

Proposition 4.2.7. Any $A$-continuation $K$ can be treated as a $\neg A$-value, by placing it in a singleton substitution, i.e., by building

\[
\_[K]
\]

Proposition 4.2.8. Any $A$-value $V$ can be lifted to a $\neg A$-continuation that immediately applies its argument to $V$, i.e., the continuation $K$ defined by

\[
K\ \kappa = \kappa\ V
\]

As  an  exercise,  the  reader  can  try  reconstructing  the  canonical  derivations  corresponding  to  each 
of  the  above  terms,  in  the  more  verbose  notation  of  Chapter  2. 

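Example 4.2.9. C+ forces us into writing programs in continuation-passing style. Where in a direct style language we would typically define a function $A \to B$, in C+ we define a continuation transformer from $B$-continuations to $A$-continuations. For example, to represent boolean conjunction, we can define a $B \otimes B$-continuation $\mathsf{and}_\kappa$, indexed by a continuation variable $\kappa : \bullet B$, which takes in a pair of booleans and throws their binary product to $\kappa$. Likewise, we can define a $B$-continuation $\mathsf{not}_\kappa$, which takes in a boolean and throws its complement to $\kappa$. These continuations are defined by the following maps from value patterns to expressions (following Proposition 4.2.6 above, we write $\mathsf{TT}$ and $\mathsf{FF}$ as syntactic sugar for $\mathsf{tt}[\cdot]$ and $\mathsf{ff}[\cdot]$ respectively):

\[
\begin{aligned}
\mathsf{and}_\kappa\ (\mathsf{tt}, \mathsf{tt}) &= \kappa\ \mathsf{TT} \\
\mathsf{and}_\kappa\ (\mathsf{tt}, \mathsf{ff}) &= \kappa\ \mathsf{FF} \\
\mathsf{and}_\kappa\ (\mathsf{ff}, \mathsf{tt}) &= \kappa\ \mathsf{FF} \\
\mathsf{and}_\kappa\ (\mathsf{ff}, \mathsf{ff}) &= \kappa\ \mathsf{FF} \\[1ex]
\mathsf{not}_\kappa\ \mathsf{tt} &= \kappa\ \mathsf{FF} \\
\mathsf{not}_\kappa\ \mathsf{ff} &= \kappa\ \mathsf{TT}
\end{aligned}
\]

Alternatively, we can define closed continuations that take the continuation variable $\kappa$ as an extra component of the pattern. For example, we can define a closed $(B \otimes B) \otimes \neg B$-continuation $\mathsf{and}^*$:

\[
\begin{aligned}
\mathsf{and}^*\ ((\mathsf{tt}, \mathsf{tt}), \kappa) &= \kappa\ \mathsf{TT} \\
\mathsf{and}^*\ ((\mathsf{tt}, \mathsf{ff}), \kappa) &= \kappa\ \mathsf{FF} \\
\mathsf{and}^*\ ((\mathsf{ff}, \mathsf{tt}), \kappa) &= \kappa\ \mathsf{FF} \\
\mathsf{and}^*\ ((\mathsf{ff}, \mathsf{ff}), \kappa) &= \kappa\ \mathsf{FF}
\end{aligned}
\]

Proposition 4.2.10. Any $N$-pattern $n$ combined with the empty substitution builds an $N$-value under any context, that is, $n[\cdot] :: (\Gamma \vdash N)$.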
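
The boolean continuations of Example 4.2.9 can be written down almost verbatim as finite lookup tables. The sketch below is an illustration only: the class `Throw` and the function names are hypothetical, continuation variables are represented by plain strings, and the closed values TT and FF are represented by bare strings with the empty substitution left implicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Throw:
    """The expression `k V`: throw the value V to the continuation variable k."""
    k: str
    value: object


# Closed boolean values TT = tt[.] and FF = ff[.] (empty substitution implicit).
TT, FF = "tt", "ff"


def and_k(k):
    """The continuation and_k on pairs of booleans, indexed by the variable k."""
    table = {(TT, TT): TT, (TT, FF): FF, (FF, TT): FF, (FF, FF): FF}
    return lambda pattern: Throw(k, table[pattern])


def not_k(k):
    """The continuation not_k on booleans, indexed by the variable k."""
    table = {TT: FF, FF: TT}
    return lambda pattern: Throw(k, table[pattern])


def and_star(pattern):
    """Closed variant: the target variable arrives as part of the pattern."""
    pair, k = pattern
    return and_k(k)(pair)
```

Because each table lists every pattern of its type, exhaustiveness of the match is visible directly in the data.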

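Example 4.2.11. We can define continuation transformers representing addition and multiplication on natural numbers. We build $N \otimes N$-continuations $\mathsf{plus}_\kappa$ and $\mathsf{times}_\kappa$, indexed by a continuation variable $\kappa : \bullet N$, defined by the following maps from value patterns to expressions:

\[
\begin{aligned}
\mathsf{plus}_\kappa\ (n_1, n_2) &= \kappa\ (n_1 + n_2)[\cdot] \\
\mathsf{times}_\kappa\ (n_1, n_2) &= \kappa\ (n_1 \times n_2)[\cdot]
\end{aligned}
\]

For example, $\mathsf{plus}_\kappa\ (2, 3) = \kappa\ 5[\cdot]$ and $\mathsf{times}_\kappa\ (2, 3) = \kappa\ 6[\cdot]$.

Alternatively, we can define closed $(N \otimes N) \otimes \neg N$-continuations $\mathsf{plus}^*$ and $\mathsf{times}^*$ which take the continuation variable as an extra component of the pattern:

\[
\begin{aligned}
\mathsf{plus}^*\ ((n_1, n_2), \kappa) &= \kappa\ (n_1 + n_2)[\cdot] \\
\mathsf{times}^*\ ((n_1, n_2), \kappa) &= \kappa\ (n_1 \times n_2)[\cdot]
\end{aligned}
\]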
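
As a rough illustration of Example 4.2.11, the maps can be sketched in Python over unary numeral patterns. Everything named here (`Throw`, `numeral`, `to_int`, and the function names) is hypothetical, and the arithmetic is delegated to Python integers, which is an assumption of the sketch rather than something the language itself provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Throw:
    """The expression `k V`: throw the value V to the continuation variable k."""
    k: str
    value: object


def numeral(n):
    """The N-pattern with n successors applied to z, as nested tuples."""
    pattern = "z"
    for _ in range(n):
        pattern = ("s", pattern)
    return pattern


def to_int(pattern):
    """Read an N-pattern back as a Python integer."""
    n = 0
    while pattern != "z":
        _, pattern = pattern
        n += 1
    return n


def plus_k(k):
    """plus_k (n1, n2) = k (n1 + n2)[.]"""
    return lambda p: Throw(k, numeral(to_int(p[0]) + to_int(p[1])))


def times_k(k):
    """times_k (n1, n2) = k (n1 * n2)[.]"""
    return lambda p: Throw(k, numeral(to_int(p[0]) * to_int(p[1])))


def plus_star(pattern):
    """Closed variant: the continuation variable is part of the pattern."""
    pair, k = pattern
    return plus_k(k)(pair)
```

Unlike the boolean case, these maps cannot be given as finite tables, since the space of $N \otimes N$-patterns is infinite.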


The  intention  of  Example  4.2.11  is  hopefully  clear — but  the  reader  might  have  doubts  about 
what  it  means  formally,  or  how  to  generalize  from  it.  Obviously,  the  definition  of  the  maps 
$\mathsf{plus}_\kappa$, $\mathsf{times}_\kappa$, etc., presupposes some basic arithmetic. But then what, precisely, was meant when I wrote that "a continuation is a map from patterns to expressions"? Are these maps arbitrary set-theoretic functions? Recursive functions? Partial recursive? Because the space of value patterns is infinite for some types (such as $N$ and $N \otimes N$), these can all be different classes
of  functions.  For  the  simple  examples  above,  it  seems  that  defining  continuations  by  recursive 
functions  will  do.  But  will  it  suffice  in  general? 

And  there  is  another  (perhaps  overlooked)  ambiguity  in  our  definition  of  terms:  it  is  circular. 
Values  are  built  out  of  substitutions,  which  are  built  out  of  continuations,  which  are  built  out 
of  expressions,  which  are  built  out  of  values.  So  should  we  read  it  as  an  inductive  definition? 
Do  we  suppose  that  terms  are  built  out  of  finite  applications  of  these  rules?  Again,  this  was  the 
case  in  all  of  the  examples  above,  but  will  it  always  work?  Note  that  this  question  already  arose 
for  canonical  derivations  in  the  logical  setting,  but  there  we  used  the  well-foundedness  of  the 
definition  ordering  to  restrict  the  height  of  derivations. 

As  we  will  see,  in  order  to  preserve  the  identity  and  composition  principles  for  canonical 
derivations in this operational setting (in particular, where we can no longer rely on the definition ordering being well-founded), we have to give negative answers to both of these questions.
We  will  adopt  the  following  conventions: 

1.  Continuations  are  defined  by  partial  recursive  functions  from  value  patterns  to  expressions. 

2.  Terms  can  be  non-well-founded. 

But these clarifications of the syntax of the language lead to an obvious next question.

4.2.4  Is  this  really  syntax?

This  somewhat  ill-formed  philosophical  question  requires  a  somewhat  ill-formed  philosophical 
answer. So let me offer one. Computer science has a long tradition of building upon higher
and  higher  levels  of  abstraction — both  in  terms  of  the  domain  of  research  (e.g.,  studying  basic 
tree  data  structures,  which  are  then  used  to  study  sorting  and  searching  algorithms,  which  are 
then  used  to  study  process  scheduling,  etc.),  and  in  terms  of  the  construction  of  the  computer 
itself.  Olivier  Danvy  explains  the  latter  situation  well: 

Overall, a computer system is constructed inductively as a (finite) tower of interpreters, from the micro-code all the way up to the graphical user interface. Compilers and partial evaluators were invented to collapse interpretive levels because too many levels make a computer system impracticably slow. The concept of meta levels therefore is forced on computer scientists: I cannot make my program work, but maybe
the  bug  is  in  the  compiler?  Or  is  it  in  the  compiler  that  compiled  the  compiler? 
Maybe  the  misbehaviour  is  due  to  a  system  upgrade?  Do  we  need  to  reboot?  and  so 
on.  Most  of  the  time,  this  kind  of  conceptual  regression  is  daunting  even  though  it 
is  rooted  in  the  history  of  the  system  at  hand,  and  thus  necessarily  finite.1 

In  similar  fashion,  there  has  been  a  gradual  progression  towards  higher  levels  of  abstraction 
in  what  computer  scientists  view  as  legitimate  descriptions  of  the  syntax  of  a  programming 
language. Consider how Alonzo Church began his definition of the $\lambda$-calculus:

We select a particular list of symbols, consisting of the symbols {, }, (, ), $\lambda$, [, ], and
an  enumerably  infinite  set  of  symbols  a,b,  c,  ...  to  be  called  variables.  And  we  define 
the  word  formula  to  mean  any  finite  sequence  of  symbols  out  of  this  list.  The  terms 
well-formed  formula,  free  variable,  and  bound  variable  are  then  defined  by  induction  as 
follows. . .  [Church,  1936] 

In 1936, "syntax" meant strings, i.e., finite sequences of marks on a piece of paper or chalkboard, and Church gives a rather cumbersome description of the legal ways of forming strings representing $\lambda$-terms. A more convenient way of specifying well-formed strings was devised by John Backus and refined by Peter Naur, originally called Backus Normal Form and now known as Backus-Naur Form (basically a notation for context-free grammars). This allowed viewing syntax slightly more abstractly, as derivations in a context-free grammar (in other words as parse trees). But consider John McCarthy's words from 1963:

The  Backus  normal  form  that  is  used  in  the  ALGOL  report,  describes  the  morphology 
of  ALGOL  programs  in  a  synthetic  manner.  Namely,  it  describes  how  the  various 
kinds  of  program  are  built  up  from  their  parts.  This  would  be  better  for  translating 
into  ALGOL  than  it  is  for  the  more  usual  problem  of  translating  from  ALGOL.  The 
form  of  syntax  we  shall  now  describe  differs  from  the  Backus  normal  form  in  two 
ways.  First,  it  is  analytic  rather  than  synthetic;  it  tells  how  to  take  a  program  apart, 
rather  than  how  to  put  it  together.  Second,  it  is  abstract  in  that  it  is  independent  of 
the  notation  used  to  represent,  say  sums,  but  only  affirms  that  they  can  be  recognized 
and  taken  apart.  [McCarthy,  1963] 

1Entry  for  "Self-interpreter",  written  by  Olivier  Danvy,  in  Appendix  A  to  [Girard,  2001]. 




Most  modern  textbooks  on  compilers  and  programming  languages  follow  McCarthy  in  making 
a  distinction  between  concrete  syntax — the  well-formed  strings  of  the  programming  language, 
with  all  the  necessary  curly  braces,  semicolons,  etc.,  described  by  a  BNF  grammar — and  abstract 
syntax.  Abstract  syntax  still  represents  a  program  by  a  labelled  tree,  but  a  simpler  one,  without 
irrelevant  information  such  as,  say,  whether  statements  are  enclosed  in  square  brackets  or  round 
parentheses.  To  be  sure,  this  information  only  becomes  irrelevant  after  we  have  already  parsed 
the  program's  string  encoding  into  a  tree.  And  parsing  is  not  a  completely  trivial  problem.  But 
once  we  have  the  parse  tree,  it  becomes  a  distraction  to  have  this  extra  information  around  if  we 
want  to  do  anything  with  the  syntax,  i.e.,  give  it  semantics.  Many  textbooks  will  only  devote  a 
couple  paragraphs  to  concrete  syntax,  before  moving  on  to  abstract  syntax. 

What  I  am  proposing  is  to  take  an  even  more  abstract  view  of  syntax,  particularly  allowing 
for  computation  in  syntax.  Not  only  does  doing  so  highlight  some  neat  symmetries,  but  it  also 
makes  it  much  easier  to  describe  and  reason  about  the  operational  behavior  of  programs,  as  we 
will  see  below.  On  the  other  hand,  just  as  the  abstract  syntax  tree  representation  of  programs 
brushes  aside  some  real  issues  (namely,  parsing),  so  too  does  this  functional  representation.  Rest 
assured,  we  will  examine  some  of  these  issues  in  Chapter  5.  Suffice  it  to  say,  if  we  build  these 
higher  and  higher  levels  of  abstraction,  we  have  to  be  willing  to  compile  them  away. 

4.2.5  Equality,  operational  semantics,  and  effects:  overview 

In his paper on "Notions of computation and monads", Eugenio Moggi contrasts operational,
denotational,  and  logical  approaches  to  reasoning  about  equivalence  of  programs: 

•  The  operational  approach  starts  from  an  operational  semantics,  e.g.,  a  partial  function  mapping  every 
program. . .  to  its  resulting  value  (if  any),  which  induces  a  congruence  relation  on  open  terms  called 
operational  equivalence. . .  Then  the  problem  is  to  prove  that  two  terms  are  operationally  equivalent. 

•  The  denotational  approach  gives  an  interpretation  of  the  (programming)  language  in  a  mathematical 
structure,  the  intended  model.  Then  the  problem  is  to  prove  that  two  terms  denote  the  same  object 
in  the  intended  model. 

•  The  logical  approach  gives  a  class  of  possible  models  for  the  (programming)  language.  Then  the 
problem is to prove that two terms denotes the same object in all possible models.2

After  explaining  some  of  the  shortcomings  of  the  operational  and  denotational  approaches, 
Moggi  then  goes  on  to  introduce  a  logical  approach  to  modelling  computation,  in  categories 
with  monads. 

Of  course,  Moggi's  definition  of  "logical"  is  biased  towards  the  model-theoretic  rather  than 
the  proof-theoretic  view  of  logic.  One  of  the  aims  of  the  following  sections  is  to  show  how  to 
give  the  operational  approach  a  logical  interpretation  of  the  latter  sort,  deriving  an  operational 
semantics  as  a  particular,  simple  cut-elimination  algorithm  for  canonical  derivations.  Our  other 
aim  is,  like  Moggi,  to  use  this  semantics  to  explore  different  notions  of  computational  effects. 

We begin by translating the composition and identity principles for canonical derivations, defined in Chapter 2, to the notation of C+ terms. These principles are comparable to the standard notions of $\beta$-reduction and $\eta$-expansion in the $\lambda$-calculus, except insofar as terms are already in normal form. The composition principles (analogous to iterated $\beta$-reduction) are binary operations on terms, while the identity principles (analogous to iterated $\eta$-expansion) are particular terms. In this sense, the theory we derive is similar in spirit to categorical semantics: although we do not formally interpret terms as arrows of a category, we do show that the composition and identity principles satisfy (suitably formulated) associativity and unit properties.

2 [Moggi, 1991]

But  it  is  only  an  equational  theory,  rather  than  a  realistic  model  of  evaluation.  For  example, 
to  define  composition  in  general  we  must  compose  terms  suspended  within  continuations,  what 
is  sometimes  referred  to  as  "evaluation  under  a  lambda".  To  obtain  a  more  realistic  operational 
semantics, we investigate a special case of composition, iterated a finite number of times:

\[
\frac{\cdot \vdash \Delta_1 \qquad \Delta_1 \vdash \Delta_2 \qquad \cdots \qquad \Delta_1, \ldots, \Delta_{n-1} \vdash \Delta_n \qquad \Delta_1, \ldots, \Delta_n \vdash \#}{\cdot \vdash \#}
\]

It happens that this $n$-ary composition principle is easier to define than the general, binary composition principles. Operationally, it corresponds to executing an open expression ($E :: (\Delta_1, \ldots, \Delta_n \vdash \#)$) within a closed environment of substitutions ($\sigma_i :: (\Delta_1, \ldots, \Delta_{i-1} \vdash \Delta_i)$, for $i = 1..n$). In this way, the operational semantics for our programming language arises naturally from its logical interpretation, as a simple cut-elimination algorithm. On the other hand, this $n$-ary composition principle also seems logically paradoxical: it results in a derivation of $\cdot \vdash \#$, i.e., a closed proof of contradiction!

Another way of viewing this situation is that the pure language C+, derived via the Curry-Howard correspondence, allows only one kind of expression: throwing a value to the continuation denoted by a variable. And what can that continuation do? Only throw a value to another
continuation,  which  likewise  must  throw  a  value  to  another  continuation,  and  so  on.  Thus  the 
only  way  we  can  ever  hope  to  instantiate  the  operational  semantics  is  by  building  a  program 
that  loops  forever,  the  environment  growing  indefinitely,  i.e.,  a  circular  proof  of  contradiction.  In 
other  words,  divergence  is  the  only  possible  observable  result  of  a  closed  program. 

This fact could be used to dismiss the Curry-Howard interpretation of polarized logic as trivial. Alternatively (the view I support, of course), it can be taken to show that the interpretation provides just the right starting point, a clean slate for investigating richer models of computation. Thus, the remainder of the
section  considers  how  to  extend  C+  with  additional  forms  of  observable  behavior,  or  side-effects. 
Taking  a  cue  from  Girard's  ludics,  we  begin  by  investigating  the  simplest  possible  side-effect: 
immediate  failure.  Now  equipped  with  two  forms  of  observable  results,  we  can  already  define  a 
non-trivial  notion  of  observational  equivalence,  and  relate  it  to  syntactic,  or  definitional  equality. 
One property proved by Girard [2001], which may seem counterintuitive, is that these two side-effects are sufficient to distinguish (i.e., prove observationally distinct) any two definitionally distinct "designs". Girard's designs are very similar to affine expressions in our language (expressions in which continuation variables are applied at most once), and we translate his result to that case. In the general case, more observations are needed: we explain how to extend the language with integer state, and show how this suffices to distinguish observationally any two C+ expressions that are definitionally distinct. We take this as one indication that C+ is indeed the right framework for investigations of computational effects under call-by-value.

4.2.6  Definitional  equality 

In  order  to  reason  about  the  properties  of  composition  and  of  observational  equivalence,  we  first 
have to clarify what we mean by "syntactic" or definitional equality. For two terms $t_1, t_2 :: (\Gamma \vdash J)$, we write $t_1 =_{\Gamma \vdash J} t_2$ (or just $t_1 = t_2$ when the form of judgment is clear from context) to indicate that $t_1$ and $t_2$ are the same term. The rules of definitional equality simply follow the rules of term construction:

\[
\frac{p :: (\Delta \Vdash A^+) \qquad \sigma_1 =_{\Gamma \vdash \Delta} \sigma_2}{p[\sigma_1] =_{\Gamma \vdash A^+} p[\sigma_2]}
\qquad\qquad
\frac{p :: (\Delta \Vdash A^+) \;\longrightarrow\; K_1(p) =_{\Gamma, \Delta \vdash \#} K_2(p)}{K_1 =_{\Gamma \vdash \bullet A^+} K_2}
\]

\[
\frac{}{\cdot =_{\Gamma \vdash \cdot} \cdot}
\qquad\qquad
\frac{\sigma_1 =_{\Gamma \vdash \Delta_1} \sigma_1' \qquad \sigma_2 =_{\Gamma \vdash \Delta_2} \sigma_2'}{(\sigma_1, \sigma_2) =_{\Gamma \vdash (\Delta_1, \Delta_2)} (\sigma_1', \sigma_2')}
\qquad\qquad
\frac{\kappa : \bullet A^+ \in \Gamma \qquad V_1 =_{\Gamma \vdash A^+} V_2}{\kappa\ V_1 =_{\Gamma \vdash \#} \kappa\ V_2}
\]

This notion of equality has a more extensional flavor than typical notions of definitional equality, but that is forced on us by the functional representation of syntax. Implicit in these rules is that we have notions of equality for continuation variables and value patterns: equality of patterns is just equality of trees (relying on the unannotated view of frames) while equality of variables is $\alpha$-equivalence (relying on the annotated view of frames). Finally, note that since we take continuations to be defined by partial recursive functions from patterns to expressions, the rule for equality implicitly requires that both continuation maps terminate on the same set of patterns. Moreover, because we allow terms to be non-well-founded, we must also allow non-well-founded equality derivations.
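
To make the extensional flavor of this equality tangible, here is a small Python sketch. It is only a bounded approximation under stated assumptions: the rule for continuations quantifies over all patterns, which cannot be checked by enumeration when the pattern space is infinite, so the hypothetical helper `agree_on` compares two continuations on a finite sample only. The names `Throw`, `numeral`, `to_int`, `plus_k` and `plus_flipped_k` are likewise illustrative.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Throw:
    """The expression `k V`: throw the value V to the continuation variable k."""
    k: str
    value: object


def numeral(n):
    """The N-pattern with n successors applied to z, as nested tuples."""
    pattern = "z"
    for _ in range(n):
        pattern = ("s", pattern)
    return pattern


def to_int(pattern):
    """Read an N-pattern back as a Python integer."""
    n = 0
    while pattern != "z":
        _, pattern = pattern
        n += 1
    return n


def plus_k(k):
    return lambda p: Throw(k, numeral(to_int(p[0]) + to_int(p[1])))


def plus_flipped_k(k):
    """The same map, defined differently: adds the components in the other order."""
    return lambda p: Throw(k, numeral(to_int(p[1]) + to_int(p[0])))


def agree_on(K1, K2, patterns):
    """Check K1(p) = K2(p) on a finite sample of patterns.

    The values thrown here are first-order, so structural comparison of the
    resulting expressions is enough.
    """
    return all(K1(p) == K2(p) for p in patterns)


sample = [(numeral(m), numeral(n)) for m, n in product(range(5), repeat=2)]
```

On this sample, `plus_k("k")` and `plus_flipped_k("k")` agree even though they are defined differently, while `plus_k("k")` and `plus_k("j")` disagree because they throw to different variables.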

Proposition 4.2.12 (Reflexivity). For all $t :: (\Gamma \vdash J)$, we have $t = t$.

Proof. Immediate by recursion on $t$. □

Proposition 4.2.13 (Symmetry). If $t_1 = t_2$ then $t_2 = t_1$.

Proof. Immediate by recursion on the derivation of $t_1 = t_2$. □

Proposition 4.2.14 (Transitivity). If $t_1 = t_2$ and $t_2 = t_3$ then $t_1 = t_3$.

Proof. Immediate by recursion on the two equality derivations. □


4.2.7  Identity 

The derivations of the two identity principles in §2.1.4 correspond to two particular terms. For any positive continuation variable $\kappa : \bullet A^+ \in \Gamma$, we build the identity $A$-continuation $\mathrm{Id}_\kappa :: (\Gamma \vdash \bullet A^+)$, and likewise for any frame $\Delta \in \Gamma$, we build the identity $\Delta$-substitution $\mathrm{Id}_{[\Delta]} :: (\Gamma \vdash \Delta)$. The notation $[\Delta]$ is used here to denote the labels in a frame, and similarly we write $[p]$ for the labels in a pattern. Then we can build the identity continuation and the identity substitution according to the following mutually recursive definitions:

\[
\mathrm{Id}_\kappa\ p = \kappa\ (p[\mathrm{Id}_{[p]}])
\qquad\qquad
\mathrm{Id}_{\cdot} = \cdot
\qquad\qquad
\mathrm{Id}_{[\Delta_1, \Delta_2]} = (\mathrm{Id}_{[\Delta_1]}, \mathrm{Id}_{[\Delta_2]})
\]
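
As a rough sketch, the mutually recursive definitions of the identity terms can be transcribed into Python, where closures delay each unfolding until a pattern is actually supplied. The names `Hole`, `Value`, `Throw`, `holes`, `id_cont` and `id_subst` are hypothetical, continuation variables are plain strings, and the identity substitution is taken to map each label to the identity continuation for that label.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hole:
    """A labelled hole in a pattern."""
    name: str


@dataclass
class Value:
    """A value p[sigma]: a pattern together with a substitution for its holes."""
    pattern: object
    subst: dict = field(default_factory=dict)


@dataclass
class Throw:
    """The expression `k V`."""
    k: str
    value: Value


def holes(pattern):
    """The labels [p] of a pattern: the names of all its holes."""
    if isinstance(pattern, Hole):
        return {pattern.name}
    if isinstance(pattern, tuple):
        return set().union(*(holes(p) for p in pattern))
    return set()


def id_cont(k):
    """Id_k p = k (p[Id_[p]])."""
    return lambda p: Throw(k, Value(p, id_subst(holes(p))))


def id_subst(labels):
    """Identity substitution: each label is mapped to its identity continuation."""
    return {name: id_cont(name) for name in labels}
```

Applying `id_cont("k")` to a pattern builds exactly one layer of structure; the continuations stored in the resulting substitution are unfolded only when they are applied in turn.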

The reader can verify that these definitions correspond precisely to the derivations in §2.1.4. As per Theorem 2.1.11, these derivations are not necessarily well-founded, if the definition ordering is not well-founded. However, we can verify that these definitions are productive, that is, we can compute the structure of $\mathrm{Id}_\kappa$ and $\mathrm{Id}_{[\Delta]}$ to arbitrary, possibly infinite precision. This is analogous to the situation in untyped $\lambda$-calculus, where $\eta$-expansion is represented by infinite Böhm trees.

4.2.8  Composition

For any positive value $V :: (\Gamma \vdash A^+)$ and positive continuation $K :: (\Gamma \vdash \bullet A^+)$, we can build an expression $K \bullet V :: (\Gamma \vdash \#)$ corresponding to the composition (reduction) principle. Likewise, for any term $t :: (\Gamma(\Delta) \vdash J)$ and $\Delta$-substitution $\sigma$, we can build a term $t[\sigma] :: (\Gamma \vdash J)$ corresponding to the composition (substitution) principle. We build them according to the following mutually recursive definitions:

\[
\begin{aligned}
K \bullet p[\sigma] &= K(p)[\sigma] \\
(\kappa\ V)[\sigma] &= \begin{cases} \sigma(\kappa) \bullet V[\sigma] & \text{if } \kappa \in \mathrm{dom}(\sigma) \\ \kappa\ (V[\sigma]) & \text{if } \kappa \notin \mathrm{dom}(\sigma) \end{cases} \\
(p[\sigma_0])[\sigma] &= p[\sigma_0[\sigma]] \\
\cdot[\sigma] &= \cdot \\
(\sigma_1, \sigma_2)[\sigma] &= (\sigma_1[\sigma], \sigma_2[\sigma]) \\
K[\sigma] &= p \mapsto K(p)[\sigma]
\end{aligned}
\]

Here $\sigma(\kappa)$ refers to the action of a substitution on variables in its frame, as defined in Observation 4.2.3. The notation $p \mapsto \ldots$ stands for a map over patterns (in the above, defining the continuation that sends any $p$ to the expression $K(p)[\sigma]$). Again, the reader can verify that these definitions correspond precisely to the derivations of the composition principles in §2.1.4. But note the derivations for composition are neither well-founded in general, nor productive: in particular, the definitions of $K \bullet V$ and $E[\sigma]$ might never yield even the outermost variable $\kappa$ of an expression. For notational convenience, we reify this possibility as a "pseudo-expression" $\Omega$ (pronounced "diverge"). $\Omega$ is also useful for making (partial-recursively defined) continuation maps total, letting us write $K(p) = \Omega$ when $K$ diverges on pattern $p$. For the purpose of definitional equality, $\Omega$ is only equal to itself.

As an example of a composition that diverges, suppose we define a $D$-continuation $K$ by

\[
K\ (\mathsf{dk}\ \kappa) = \kappa\ (\mathsf{DK}\ K)
\]

(Recall that $\mathsf{DK}\ K$ is syntactic sugar for $\mathsf{dk}\ \_[K]$. We must also define $K\ (\mathsf{dn}\ n)$ for $K$ to be exhaustive over $D$-patterns, but the definition is irrelevant here.) Then $K \bullet \mathsf{DK}\ K = \Omega$.
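
The composition equations above can be sketched as a small Python interpreter. This is an illustration under several assumptions that go beyond the text: all class and function names are hypothetical; continuations are Python functions from patterns to expressions; divergence is approximated by a step budget (so a long but terminating composition would also be reported as `OMEGA`); and names bound by a pattern are taken to shadow the substitution being applied, with no further renaming.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hole:
    name: str


@dataclass
class Value:
    pattern: object                             # p
    subst: dict = field(default_factory=dict)   # hole name -> continuation


@dataclass
class Throw:
    k: str                                      # continuation variable
    value: Value


OMEGA = "Omega"                                 # stand-in for "diverge"


class OutOfFuel(Exception):
    pass


def holes(pattern):
    if isinstance(pattern, Hole):
        return {pattern.name}
    if isinstance(pattern, tuple):
        return set().union(*(holes(p) for p in pattern))
    return set()


class Machine:
    def __init__(self, fuel):
        self.fuel = fuel

    def compose(self, K, V):                    # K . p[s] = K(p)[s]
        if self.fuel == 0:
            raise OutOfFuel
        self.fuel -= 1
        return self.subst_expr(K(V.pattern), V.subst)

    def subst_expr(self, E, s):
        V = self.subst_value(E.value, s)
        if E.k in s:                            # (k V)[s] = s(k) . V[s]
            return self.compose(s[E.k], V)
        return Throw(E.k, V)                    # k not in dom(s): k (V[s])

    def subst_value(self, V, s):                # (p[s0])[s] = p[s0[s]]
        return Value(V.pattern,
                     {n: self.subst_cont(K, s) for n, K in V.subst.items()})

    def subst_cont(self, K, s):                 # K[s] = p |-> K(p)[s]
        def substituted(p):
            bound = holes(p)                    # names rebound by p shadow s
            visible = {n: c for n, c in s.items() if n not in bound}
            return self.subst_expr(K(p), visible)
        return substituted


def run(K, V, fuel=50):
    """Compute K . V, reporting OMEGA if the step budget is exhausted."""
    try:
        return Machine(fuel).compose(K, V)
    except OutOfFuel:
        return OMEGA


def loop(p):
    """K (dk k) = k (DK K); the dn case is irrelevant here and omitted."""
    _, hole = p
    return Throw(hole.name, Value(("dk", Hole("h")), {"h": loop}))


def not_k(k):
    flip = {"tt": "ff", "ff": "tt"}
    return lambda p: Throw(k, Value(flip[p]))


def apply_to_tt(p):
    """The continuation K k = k TT, which throws TT to whatever it receives."""
    return Throw(p.name, Value("tt"))
```

With these definitions, `run(loop, Value(("dk", Hole("h")), {"h": loop}))` exhausts its budget, mirroring the diverging composition above, while `run(apply_to_tt, Value(Hole("j"), {"j": not_k("halt")}))` ends by throwing FF to the free variable `halt`.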

4.2.9  Properties  of  composition 
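
Lemma 4.2.15 (Unit laws).

1. $t[\mathrm{Id}_{[\Delta]}] = t$
where $t :: (\Gamma, \Delta \vdash J)$

2. $\mathrm{Id}_\kappa \bullet V = \kappa\ V$
where $V :: (\Gamma \vdash A)$ and $\kappa : \bullet A \in \Gamma$

3. $\mathrm{Id}_{[\Delta]}[\sigma] = \sigma$
where $\sigma :: (\Gamma \vdash \Delta)$

Proof. We derive the three equalities by mutual recursion. For equality (1), we examine the form of $t$. In most cases we appeal to (1) on subterms of $t$ and then derive the equality. When $t = \kappa\ V$ with $\kappa \in \Delta$, we have $(\kappa\ V)[\mathrm{Id}_{[\Delta]}] = \mathrm{Id}_\kappa \bullet V[\mathrm{Id}_{[\Delta]}] = \mathrm{Id}_\kappa \bullet V = \kappa\ V$, using (1) on the subterm $V$ and then (2). For equality (2), we know $V$ is of the form $p[\sigma]$ for some $p :: (\Delta \Vdash A)$, and we have $\mathrm{Id}_\kappa \bullet p[\sigma] = (\kappa\ (p[\mathrm{Id}_{[p]}]))[\sigma] = \kappa\ (p[\mathrm{Id}_{[p]}[\sigma]]) = \kappa\ (p[\sigma])$, with the last step by (3). Finally, to show equality (3), we recur over the structure of $\Delta$, the interesting case being when it is a singleton $\kappa : \bullet B$. Then $\sigma$ has the form $(K/\kappa)$ for some $K$, and $\mathrm{Id}_{[\Delta]}$ has the form $(\mathrm{Id}_\kappa/\kappa)$. To show $\mathrm{Id}_\kappa[(K/\kappa)]$ and $K$ are equal, we must apply the two sides to arbitrary value patterns $p' :: (\Delta' \Vdash B)$, and show they are equal at $\Gamma, \Delta' \vdash \#$. We have that

\[
\begin{aligned}
\mathrm{Id}_\kappa[(K/\kappa)](p') &= (\kappa\ p'[\mathrm{Id}_{[p']}])[(K/\kappa)] \\
&= K \bullet p'[(\mathrm{Id}_{[p']}[(K/\kappa)])] \\
&= K \bullet p'[\mathrm{Id}_{[p']}] && (*) \\
&= K(p')[\mathrm{Id}_{[p']}] \\
&= K(p') && (1)
\end{aligned}
\]

where in $(*)$ we use the fact that $\kappa \notin [p'] = [\Delta']$. □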

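Lemma 4.2.16.

1. $(K \bullet V)[\sigma] = K[\sigma] \bullet V[\sigma]$
where $\sigma :: (\Gamma \vdash \Delta)$ and $V :: (\Gamma, \Delta \vdash A)$ and $K :: (\Gamma, \Delta \vdash \bullet A)$

2. $(t[\sigma_2])[\sigma_1] = (t[\sigma_1])[\sigma_2[\sigma_1]]$
where $\sigma_1 :: (\Gamma \vdash \Delta_1)$ and $\sigma_2 :: (\Gamma, \Delta_1 \vdash \Delta_2)$ and $t :: (\Gamma, \Delta_1, \Delta_2 \vdash J)$

Proof. For (1), we know $V$ is of the form $p[\sigma_0]$, and we have that $(K \bullet p[\sigma_0])[\sigma] = (K(p)[\sigma_0])[\sigma] = (K(p)[\sigma])[\sigma_0[\sigma]] = K[\sigma] \bullet p[\sigma_0[\sigma]] = K[\sigma] \bullet (p[\sigma_0])[\sigma]$, with the second equality by (2). For (2), we examine the form of $t$. In most cases we appeal to (2) on subterms of $t$ and derive the equality. When $t = \kappa\ V$ for some $\kappa \in \Delta_2$, we have

\[
\begin{aligned}
((\kappa\ V)[\sigma_2])[\sigma_1] &= (\sigma_2(\kappa) \bullet V[\sigma_2])[\sigma_1] \\
&= \sigma_2(\kappa)[\sigma_1] \bullet (V[\sigma_2])[\sigma_1] && (1) \\
&= \sigma_2(\kappa)[\sigma_1] \bullet (V[\sigma_1])[\sigma_2[\sigma_1]] && (2) \\
&= (\kappa\ (V[\sigma_1]))[\sigma_2[\sigma_1]]
\end{aligned}
\]

□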

Proposition 4.2.17 (Commutation).  Let σ₁ :: (Γ ⊢ Δ₁) and σ₂ :: (Γ ⊢ Δ₂) and t :: (Γ, Δ₁, Δ₂ ⊢ J).
Then t[(σ₁, σ₂)] = t[σ₁][σ₂] = t[σ₂][σ₁].

Proof.  Immediate.  □ 

Corollary 4.2.18 (Associativity).  Let σ₁ :: (Γ ⊢ Δ₁) and σ₂ :: (Γ, Δ₁ ⊢ Δ₂) and t :: (Γ, Δ₁, Δ₂ ⊢ J).
Then (t[σ₁])[σ₂] = t[(Id_[Δ₂], σ₁)[σ₂]].

Proof.  Combining the above facts we have

t[(Id_[Δ₂], σ₁)[σ₂]] = t[(Id_[Δ₂][σ₂], σ₁[σ₂])]     (def.)
                     = t[(σ₂, σ₁[σ₂])]              (Lemma 4.2.15(3))
                     = (t[σ₂])[σ₁[σ₂]]              (Prop. 4.2.17)
                     = (t[σ₁])[σ₂]                  (Lemma 4.2.16(2))

□


4.2.10  Complex  variables 

In Chapter 2 we introduced a notion of complex hypotheses, whose only role was to simplify the
presentation of canonical derivations. For the same purpose, we can add them to L⁺. The
idea is that a hypothesis x : A in the context takes the place of explicit quantification over
A-patterns. We call x a complex value variable. As a simple motivating example, suppose
that we want to define the projection functions on pairs as continuation transformers.
For arbitrary positive types A and B, given continuation variables κ₁ : •A and κ₂ : •B, we define
the projections by the following maps on A ⊗ B-patterns:

π₁ (p₁, p₂) = κ₁ (p₁[Id_[p₁]])

π₂ (p₁, p₂) = κ₂ (p₂[Id_[p₂]])

Because the projection functions work generically over A and B, matching on p₁ and p₂ only to
reconstruct them, we could write π₁ and π₂ more concisely using complex variables:

π₁ (x₁, x₂) = κ₁ Id_x₁
π₂ (x₁, x₂) = κ₂ Id_x₂

Basically we would like to write the functions using shallow pattern-matching, rather than
pattern-matching all the way down. This is really just syntactic sugar (the two versions of
the functions behave exactly the same way, as they would in a strict language like ML), but it is
convenient syntactic sugar. Formally, we include in L⁺ only a single construct related to complex
hypotheses, which case-analyzes them away:


p :: (Δ ⊩ A⁺)  ⟶  t_p :: (Γ(Δ) ⊢ J)
──────────────────────────────────────────
case x of (p ↦ t_p) :: (Γ(x : A⁺) ⊢ J)

There  is  no  explicit  construct  to  introduce  complex  value  variables,  but  we  can  define  one  via 
pattern  substitution: 

Proposition 4.2.19 (Pattern substitution).  If t :: (Γ(x : A) ⊢ J) and p :: (Δ ⊩ A) then t[p/x] ::
(Γ(Δ) ⊢ J), where t[p/x] is defined as follows (t* indicates a pattern-indexed term, t* = p ↦ t_p):

(case x of t*)[p/x] = t*(p)

(case y of t*)[p/x] = case y of (p′ ↦ t*(p′)[p/x])

(p′[σ])[p/x] = p′[σ[p/x]]

·[p/x] = ·

(σ₁, σ₂)[p/x] = (σ₁[p/x], σ₂[p/x])

K[p/x] = (p′ ↦ K(p′)[p/x])

(κ V)[p/x] = κ (V[p/x])

This is more or less the usual notion of capture-avoiding substitution, and importantly, unlike
the composition principles, does not involve any "serious" computation (in Reynolds' sense).

To see how this gives us license to build terms using shallow pattern-matching, consider again
the above definitions. To build an A ⊗ B-continuation in context Γ, we must give a map from
patterns p :: (Δ ⊩ A ⊗ B) to expressions in context Γ, Δ. By the definition of A ⊗ B, this is
equivalent to defining a map from patterns p₁ :: (Δ₁ ⊩ A) and p₂ :: (Δ₂ ⊩ B) to expressions in
Γ, Δ₁, Δ₂. But by pattern substitution, this is equivalent to constructing a single expression in
context Γ, x₁ : A, x₂ : B. So the two versions of π₁ and π₂ really are the same, with an implicit
use of pattern substitution in the definitions by shallow pattern-matching.

Although  complex  value  variables  are  basically  syntactic  sugar,  we  must  still  check  that  they 
cohere  with  our  definitions  of  equality,  identity,  and  composition.  Two  terms  are  equal  in  a 
context  with  complex  hypotheses,  if  they  are  equal  under  all  pattern  substitutions: 

p :: (Δ ⊩ A⁺)  ⟶  t₁[p/x] =_{Γ(Δ) ⊢ J} t₂[p/x]
──────────────────────────────────────────────
t₁ =_{Γ(x : A⁺) ⊢ J} t₂

Given a complex value variable x, we construct the identity value Id_x by case analysis:

Id_x = case x of p ↦ p[Id_[p]]

And likewise, to compose two terms in a context with complex hypotheses (V :: (Γ(x : B⁺) ⊢ A⁺)
and K :: (Γ(x : B⁺) ⊢ •A⁺), or t :: (Γ(x : B⁺)(Δ) ⊢ J) and σ :: (Γ(x : B⁺) ⊢ Δ)), we first do a
case-analysis:

K • V = case x of (p ↦ K[p/x] • V[p/x])
t[σ] = case x of (p ↦ (t[p/x])[σ[p/x]])

Finally, we can substitute a value V = p[σ] for a complex hypothesis x within a term t, simply
by performing the pattern substitution [p/x] followed by σ, i.e., t[V/x] = (t[p/x])[σ].

4.2.11  Type  isomorphisms 

With  a  notion  of  identity  and  composition  of  terms,  we  can  give  a  definition  of  type  isomorphism, 
which  is  useful  for  formalizing  our  intuition  that  two  types  provide  the  same  information. 

Notation.  For two continuation transformers φ :: (κ : •B ⊢ •A) and ψ :: (κ : •C ⊢ •B), we write ψ ∘ φ
as shorthand for φ[ψ/κ] :: (κ : •C ⊢ •A). Similarly, for two value transformers φ* :: (x : A ⊢ B) and
ψ* :: (x : B ⊢ C), we write ψ* ∘ φ* as shorthand for ψ*[φ*/x] :: (x : A ⊢ C).

Definition 4.2.20 (Type isomorphism).  We say that two positive types A and B are isomorphic
(A ≈ B) if there exist a pair of continuation transformers φ :: (κ : •B ⊢ •A) and ψ :: (κ : •A ⊢ •B)
which are inverses, i.e., such that:

1.  ψ ∘ φ =_{κ:•A ⊢ •A} Id_κ

2.  φ ∘ ψ =_{κ:•B ⊢ •B} Id_κ

Recall that in §2.3.3 we also defined a "positive" notion of entailment A ⊢ B, in addition to
the negative notion •B ⊢ •A. We could do likewise here, and define another notion of type
isomorphism by the existence of a pair of value transformers which are inverses. However,
because of the requirement of isomorphism, this positive notion is equivalent to the negative one.

Proposition 4.2.21.  A ≈ B iff there exist a pair of value transformers φ* :: (x : A ⊢ B) and ψ* ::
(x : B ⊢ A) which are inverses, i.e., such that:

1.  ψ* ∘ φ* =_{x:A ⊢ A} Id_x

2.  φ* ∘ ψ* =_{x:B ⊢ B} Id_x

Proof.  The backwards direction is analogous to Proposition 2.3.9: we take φ = x ↦ κ φ* and
ψ = x ↦ κ ψ*. In the forwards direction, because ψ ∘ φ = x ↦ κ Id_x and φ ∘ ψ = x ↦ κ Id_x, we
know (by the definition of composition) that φ and ψ must be of the form φ = x ↦ κ φ* and
ψ = x ↦ κ ψ* for some φ* and ψ*. Then because ψ ∘ φ = x ↦ κ (ψ* ∘ φ*) and φ ∘ ψ = x ↦ κ (φ* ∘ ψ*),
we know that φ* and ψ* must be inverses.  □

Proposition 4.2.22.  ≈ is reflexive, symmetric, and transitive.

Proof.  Symmetry is immediate, while reflexivity and transitivity are a consequence of the unit
laws and associativity (Lemma 4.2.15 and Corollary 4.2.18). Explicitly, we have A ≈ A for all A,
because Id_κ :: (κ : •A ⊢ •A), and Id_κ ∘ Id_κ = Id_κ. Suppose A ≈ B and B ≈ C. Then there exist
φ₁ :: (κ : •B ⊢ •A) and ψ₁ :: (κ : •A ⊢ •B) such that φ₁ ∘ ψ₁ = Id_κ and ψ₁ ∘ φ₁ = Id_κ, and likewise
φ₂ :: (κ : •C ⊢ •B) and ψ₂ :: (κ : •B ⊢ •C) such that φ₂ ∘ ψ₂ = Id_κ and ψ₂ ∘ φ₂ = Id_κ. Then
(φ₁ ∘ φ₂) :: (κ : •C ⊢ •A) and (ψ₂ ∘ ψ₁) :: (κ : •A ⊢ •C), and

(φ₁ ∘ φ₂) ∘ (ψ₂ ∘ ψ₁) = φ₁ ∘ (φ₂ ∘ ψ₂) ∘ ψ₁
                      = φ₁ ∘ Id_κ ∘ ψ₁
                      = φ₁ ∘ ψ₁
                      = Id_κ

(ψ₂ ∘ ψ₁) ∘ (φ₁ ∘ φ₂) = ψ₂ ∘ (ψ₁ ∘ φ₁) ∘ φ₂
                      = ψ₂ ∘ Id_κ ∘ φ₂
                      = ψ₂ ∘ φ₂
                      = Id_κ

Hence A ≈ C.  □

Proposition 4.2.23 (cf. Laurent [2005]).  The following isomorphisms of positive types hold:

A ⊗ (B ⊗ C) ≈ (A ⊗ B) ⊗ C

A ⊗ B ≈ B ⊗ A

A ≈ A ⊗ 1

A ⊕ (B ⊕ C) ≈ (A ⊕ B) ⊕ C

A ⊕ B ≈ B ⊕ A

A ≈ A ⊕ 0

A ⊗ (B ⊕ C) ≈ (A ⊗ B) ⊕ (A ⊗ C)

A ⊗ 0 ≈ 0

¬(A ⊕ B) ≈ ¬A ⊗ ¬B        ¬0 ≈ 1


Proof.  Routine  calculations.  □ 
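
As an informal illustration of a few of these isomorphisms at the level of value transformers (in the sense of Proposition 4.2.21), the following Python sketch models A ⊗ B as pairs and A ⊕ B as tagged values, and checks on sample data that each pair of maps composes to the identity in both directions. The encoding and every function name here are assumptions made for this illustration; it is not the calculus itself, and checking samples is not a proof.

```python
# Illustrative encoding (an assumption of this sketch, not the calculus):
#   A (x) B  is modelled as a Python pair (a, b)
#   A (+) B  is modelled as a tagged value ("inl", a) or ("inr", b)

def swap(v):
    """A (x) B -> B (x) A."""
    a, b = v
    return (b, a)

def assoc(v):
    """A (x) (B (x) C) -> (A (x) B) (x) C."""
    a, (b, c) = v
    return ((a, b), c)

def unassoc(v):
    """(A (x) B) (x) C -> A (x) (B (x) C)."""
    (a, b), c = v
    return (a, (b, c))

def distribute(v):
    """A (x) (B (+) C) -> (A (x) B) (+) (A (x) C)."""
    a, (tag, x) = v
    return (tag, (a, x))

def factor(v):
    """(A (x) B) (+) (A (x) C) -> A (x) (B (+) C)."""
    tag, (a, x) = v
    return (a, (tag, x))

def mutually_inverse(f, g, samples):
    """Check g(f(v)) == v on the samples and f(g(w)) == w on their images."""
    images = [f(v) for v in samples]
    return (all(g(w) == v for v, w in zip(samples, images))
            and all(f(g(w)) == w for w in images))

pairs = [(1, "x"), (2, "y")]
nested = [(1, ("x", True)), (2, ("y", False))]
sums = [(1, ("inl", "x")), (2, ("inr", False))]

swap_ok = mutually_inverse(swap, swap, pairs)
assoc_ok = mutually_inverse(assoc, unassoc, nested)
distribute_ok = mutually_inverse(distribute, factor, sums)
```

Each check only exercises the two composites on finitely many inputs, which mirrors the two equations required of an isomorphism without establishing them in general.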

4.2.12  Environment  semantics 

As  noted  in  the  introduction  to  this  section,  the  composition  and  identity  principles  for  terms 
provide an equational theory comparable to βη-conversion for the λ-calculus, but not a realistic
account  of  evaluation.  We  now  describe  a  more  conventional  operational  semantics  based  on 
environments.  We  deliberately  exclude  complex  hypotheses  from  the  environment  semantics, 
because  they  do  not  play  an  interesting  computational  role. 

Definition 4.2.24 (Environments).  An environment γ for context Γ is built as follows:

- (emp) emp is an environment for the empty context ·;
- (bind) if γ is an environment for Γ and σ :: (Γ ⊢ Δ), then bind(γ, σ) is an environment for Γ, Δ.

In other words, an environment is an ordered list of substitutions, each of which can reference
variables bound by prior substitutions. We use the notation (γ; σ) as an abbreviation for bind(γ, σ), and
(σ₁; …; σₙ) as an abbreviation for bind(… bind(bind(emp, σ₁), σ₂) …, σₙ).

Proposition 4.2.25.  Let γ = (σ₁; …; σₙ) be a Γ-environment, for Γ = Δ₁, …, Δₙ, and let t :: (Γ ⊢ J).
Then (((t[σₙ])[σₙ₋₁]) …)[σ₁] :: (· ⊢ J).

Proof.  By n applications of the composition (substitution) principle, as we can illustrate like so:

                          σₙ                          t
                 Δ₁, …, Δₙ₋₁ ⊢ Δₙ            Δ₁, …, Δₙ ⊢ J
                 ──────────────────────────────────────────
                            Δ₁, …, Δₙ₋₁ ⊢ J
                                  ⋮
                    σ₂
     σ₁          Δ₁ ⊢ Δ₂        Δ₁, Δ₂ ⊢ J
                 ──────────────────────────
   · ⊢ Δ₁                 Δ₁ ⊢ J
   ──────────────────────────────
               · ⊢ J

□


Notation.  We write t[γ] for the n-fold composition (((t[σₙ])[σₙ₋₁]) …)[σ₁] defined in Prop. 4.2.25.

Proposition 4.2.25 tells us that a Γ-environment γ and an expression E :: (Γ ⊢ #) can be
combined to build a closed expression E[γ]. But rather than evaluating E[γ] by performing this
n-fold composition, we will define a small-step operational semantics to compute it directly.

Proposition 4.2.26.  Given a Γ-environment γ = (σ₁; …; σₙ) and a continuation variable κ : •A ∈ Γ,
there is some σᵢ such that σᵢ(κ) :: (Γ ⊢ •A).

Proof.  Since κ : •A ∈ Γ, it must be in Δᵢ for some 1 ≤ i ≤ n. Since σᵢ :: (Δ₁, …, Δᵢ₋₁ ⊢ Δᵢ), we
have σᵢ(κ) :: (Δ₁, …, Δᵢ₋₁ ⊢ •A) by definition. Hence σᵢ(κ) :: (Γ ⊢ •A) by weakening.  □

Notation.  We write lookup(γ, κ) :: (Γ ⊢ •A) for the continuation σᵢ(κ) selected by Prop. 4.2.26, or
when we are feeling terse, simply γ(κ). Explicitly, lookup(γ, κ) is defined as follows:

lookup(bind(γ, σ), κ) = σ(κ)            if κ ∈ dom(σ)
lookup(bind(γ, σ), κ) = lookup(γ, κ)    if κ ∈ dom(γ)


Definition 4.2.27 (Programs).  A program is either a pair (γ | E) of a Γ-environment γ and an
expression E :: (Γ ⊢ #), or a triple (γ | K | V) of a Γ-environment γ, a continuation K :: (Γ ⊢ •A) and
a value V :: (Γ ⊢ A). We use the letter P to range over programs.

Definition 4.2.28 (Environment semantics).  A small-step environment semantics is a relation
P ⇝ P′ between programs. A result is a program P such that P ⇝ P′ holds for no P′, or Ω. The small-step
semantics induces a big-step evaluation relation P ⇓ R between programs and results, defined inductively
as follows:

P ⇝ P′    P′ ⇓ R            P result
─────────────────           ────────
      P ⇓ R                  P ⇓ P

We reify the possibility that a program diverges by writing P ⇓ Ω.

For clarity, we will distinguish the language L⁺ as described thus far, as well as its total fragment,
from further extensions of L⁺ with effects.

Definition 4.2.29 (Purity and totality).  We refer to arbitrary terms of L⁺ (i.e., possibly non-well-founded
derivations involving positive types, where continuations are defined by partial-recursive functions, and
including the pseudo-expression Ω representing divergence) as pure terms. We refer to L⁺ terms with
continuations defined by total-recursive functions, and without the pseudo-expression Ω, as total terms.

The small-step environment semantics for total L⁺ is given by a pair of rules:

(γ | κ V) ⇝ (γ | lookup(γ, κ) | V)            (lookup+)

(γ | K | p[σ]) ⇝ (bind(γ, σ) | K(p))          (bind/call+)

In pure L⁺, we add a single rule for the pseudo-expression Ω:

(γ | Ω) ⇝ (γ | Ω)                             (loop)

Clearly, the two rules lookup+ and bind/call+ can be refactored into a single rule:

(γ | κ (p[σ])) ⇝ (bind(γ, σ) | lookup(γ, κ)(p))    (go+)

which is sometimes expedient when we don't care about the intermediate transition. This compound
rule can be glossed as:

To execute κ (p[σ]) in environment γ, first look up the continuation K the environment
associates with κ, then add σ to the environment, and finally execute the expression K(p).

Because this is an intrinsic semantics (that is, it is defined on programs that are well-typed by
construction), a type safety theorem in the standard sense is unnecessary. However, we can state
stronger versions of the usual "progress" and "preservation" lemmas.
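
The transition rules above can be sketched as a small interpreter. The following Python fragment is a minimal, untyped illustration under several assumptions that are not part of the thesis: an environment is a list of dictionaries (one per bound substitution), a continuation is a Python function from patterns to expressions, a pattern such as ("var", "k1") names the continuation variable it binds, and variable names are assumed to be globally distinct. The names `call`, `lookup`, `step` and `run` are hypothetical.

```python
OMEGA = ("omega",)                      # pseudo-expression for divergence

def call(kappa, pattern, sigma):
    """The expression  kappa (pattern[sigma])."""
    return ("call", kappa, pattern, sigma)

def lookup(env, kappa):
    """lookup(bind(env, sigma), kappa): search the most recent binding first."""
    for sigma in reversed(env):
        if kappa in sigma:
            return sigma[kappa]
    raise KeyError(kappa)

def step(env, expr):
    """One transition: the loop rule for OMEGA, the compound go+ rule otherwise."""
    if expr[0] == "omega":
        return env, expr                 # (env | omega) steps to itself
    _, kappa, pattern, sigma = expr
    continuation = lookup(env, kappa)    # looked up before sigma is bound
    return env + [sigma], continuation(pattern)

def run(env, expr, fuel):
    """Take at most `fuel` go+ steps, recording the (variable, pattern) pairs used."""
    trace = []
    for _ in range(fuel):
        if expr[0] == "omega":
            break
        trace.append((expr[1], expr[2]))
        env, expr = step(env, expr)
    return env, expr, trace

UNIT = ("unit",)
# Call k with a value binding k1, where k1 diverges as soon as it is invoked.
expr0 = call("k", ("var", "k1"), {"k1": lambda p: OMEGA})
# The environment answers k by invoking the continuation variable it was handed.
env0 = [{"k": lambda p: call(p[1], UNIT, {})}]

final_env, final_expr, trace = run(env0, expr0, fuel=10)
```

Because every pure program can always take a step, `run` needs an explicit step bound; here it stops early only because the program reaches the divergent pseudo-expression, which would otherwise step to itself forever.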

Lemma 4.2.30 (Intrinsic progress).  For all programs P in pure L⁺, there exists a P′ such that P ⇝ P′.

Proof.  Immediate. All pure expressions E :: (Γ ⊢ #) are either Ω (we can invoke loop) or of the
form E = κ V, where κ : •A ∈ Γ and V :: (Γ ⊢ A) (we can invoke lookup+). Likewise, all pure
values V :: (Γ ⊢ A) are of the form V = p[σ], where p :: (Δ ⊩ A) and σ :: (Γ ⊢ Δ) (we can invoke
bind/call+).  □

The intrinsic progress lemma has a surprising consequence: in pure L⁺, there is only one possible
result (Ω), and in total L⁺ there are none!

Lemma 4.2.31.  t[bind(γ, σ)] = (t[γ])[σ[γ]]

Proof.  By Lemma 4.2.16.  □

Lemma 4.2.32 (Intrinsic preservation).  For pure L⁺, if (γ | E) ⇝ (γ | K | V) ⇝ (γ′ | E′), then
E[γ] = (K • V)[γ] = E′[γ′].

Proof.  The transitions are by the lookup+ and bind/call+ rules, so that E = κ V, V = p[σ],
K = γ(κ), γ′ = (γ; σ), and E′ = K(p). We have:

(κ V)[γ] = γ(κ)[γ] • V[γ]
         = (γ(κ)(p)[γ])[σ[γ]]
         = γ(κ)(p)[(γ; σ)]

with the last step by Lemma 4.2.31.  □

We  can  apply  intrinsic  preservation  to  show  that  the  environment  semantics  indeed  implements 
the  desired  behavior  of  n-fold  composition. 

Corollary 4.2.33 (Functionality).  For pure L⁺, if (γ | E) ⇓ R then E[γ] = R.

4.2.13  Observational  equivalence 

The  environment  semantics  leads  to  a  natural  definition  of  observational  equivalence. 

Definition 4.2.34 (Observational equivalence).  Let E₁, E₂ :: (Γ ⊢ #) be two expressions. We say
that E₁ and E₂ are observationally equivalent (written E₁ ≅ E₂) if for all Γ-environments γ and all
results R, we have (γ | E₁) ⇓ R iff (γ | E₂) ⇓ R.

It is easy to see that definitional equality implies observational equivalence.

Definition 4.2.35 (Equality of environments).  Two Γ-environments γ = (σ₁; …; σₙ) and γ′ =
(σ′₁; …; σ′ₙ) are definitionally equal (written γ =_Γ γ′) if σᵢ = σ′ᵢ for all 1 ≤ i ≤ n.

Proposition  4.2.36.  If  7  =r  7'  then  'y(n)  =r,Ah#  for  all  k  :  »A  e  T. 

Lemma 4.2.37 (Congruence).  If (γ₁ | E₁) ⇓ R and E₁ = E₂ and γ₁ = γ₂ then (γ₂ | E₂) ⇓ R.

Proof.  For pure L⁺, we know the derivation must have the form:

(γ₁ | E₁) ⇝ (γ′₁ | E′₁)    (γ′₁ | E′₁) ⇓ R
───────────────────────────────────────────
              (γ₁ | E₁) ⇓ R

with the transition (γ₁ | E₁) ⇝ (γ′₁ | E′₁) either by loop or decomposed by the go+ compound
rule. In the former case we have E₁ = E₂ = R = Ω, and the statement is trivial. In the
latter case, we know E₁ = κ (p[σ₁]), E′₁ = γ₁(κ)(p), and γ′₁ = (γ₁; σ₁). Since E₁ = E₂, we
know that E₂ = κ (p[σ₂]) for some σ₂ = σ₁. Then (γ₂ | E₂) ⇝ ((γ₂; σ₂) | γ₂(κ)(p)) by go+.
Since (γ₁; σ₁) = (γ₂; σ₂) by definition, and γ₁(κ)(p) = γ₂(κ)(p) by Proposition 4.2.36, we have
(γ₂ | E₂) ⇓ R by appeal back to congruence.  □

As  an  immediate  result  of  congruence,  we  have: 

Theorem 4.2.38.  If E₁ = E₂ then E₁ ≅ E₂.

The more interesting question is, how coarse is observational equivalence? We already remarked
that in pure L⁺, there is only one possible result (divergence), so the relation ≅ is indeed trivial.
But as we add more kinds of observations to the language, we get a progressively finer notion
of observational equivalence on pure expressions, until it eventually coincides with definitional
equality.

4.2.14  Immediate  failure,  and  the  chronicle  representation 

We extend L⁺ with a single rule:

℧ :: (Γ ⊢ #)

The expression ℧ is pronounced "fail". We give no additional transition rules, so that ℧ is
a possible result, which can be thought of as representing some sort of safety violation: an
uncaught exception, an abort, a hardware crash, etc. Obviously, ℧ is a new kind of result, distinct
from Ω, so formally we declare that ℧ is only definitionally equal to itself. We write L⁺_℧ for this
extension of L⁺, and ≅_℧ for the derived observational equivalence relation on expressions of L⁺_℧.
Note that this also gives us a new equivalence relation on pure L⁺ expressions, by restriction.

Logically, the rule ℧ corresponds to immediately asserting a contradiction under any set of
assumptions: a plainly inconsistent reasoning principle. Girard [2001] introduced the same rule,
which he calls daimon, in the setting of ludics. There, he proved a statement (the Separation
Theorem, an analogue of Böhm's theorem for λ-calculus [1968]) which essentially says that Ω and
℧ are sufficient effects to distinguish observationally any two "designs" which are definitionally
distinct. Here, we show that this holds for affine expressions.

Definition 4.2.39 (Affineness).  We say that a term t :: (Γ ⊢ J) is affine when

1.  For every subterm κ V of t, κ is not free in V, and

2.  For every subterm (σ₁, σ₂) of t, σ₁ and σ₂ have disjoint free variables.

In order to prove this result about definitional equality of affine expressions, as well as to pave
the way for the result about arbitrary expressions in the next section, we introduce (analogues
of) several technical notions from ludics, starting with that of chronicle.

Definition 4.2.40 (Chronicles).  A proper action is a tuple (κ, p) of a continuation variable κ and a
pattern p of the same type, i.e., κ : •A and p :: (Δ ⊩ A) for some A and Δ. An improper action is
either Ω or ℧. We use α to range over actions, proper or improper. A chronicle c is a sequence of actions
c = α₁ ⋯ αₙ.

Girard defines chronicles near the start of "Locus Solum", in order to construct the basic proof
objects of ludics (designs) as special sets of chronicles. Because we already have a working
language of proofs (L⁺ and its extension L⁺_℧), we shall view chronicles more as derived artifacts.
Concretely, our definition omits many of the side-conditions that are part of the definition of
chronicles in ludics; instead, we can show that they are derived properties of chronicles in the
chronicle representation of a term.

Notation.  We write (κ, p, σ) as shorthand for κ (p[σ]), and σ κ′ p′ as shorthand for σ(κ′)(p′).

Definition 4.2.41 (Chronicle representation).  Let t be an expression or substitution in L⁺_℧. We define
a set of chronicles |t| (called the chronicle representation of t) as follows:

─────────
· ∈ |t|

─────────        ─────────
Ω ∈ |Ω|          ℧ ∈ |℧|

c ∈ |σ|                          c ∈ |σ κ′ p′|
────────────────────────         ───────────────────
(κ, p) · c ∈ |(κ, p, σ)|         (κ′, p′) · c ∈ |σ|

If α · c ∈ |t|, we say that α is positive if t is an expression, or negative if t is a substitution. Thus,
the odd-numbered (even-numbered) actions of a chronicle in an expression (substitution) are positive, and
even-numbered (odd-numbered) actions are negative.
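
For finite terms, the rules of Definition 4.2.41 can be turned directly into a recursive enumeration. The Python sketch below does this under assumptions of our own: a continuation is represented as a finite dictionary from patterns to expressions (so that all entries can be listed), a pattern such as ("var", "k1") names the variable it binds, and the improper actions are the strings "omega" and "fail". All function names are hypothetical.

```python
OMEGA = "omega"    # divergence, also used as the improper action
FAIL = "fail"      # immediate failure, also used as the improper action

def call(kappa, pattern, sigma):
    """The expression (kappa, pattern, sigma), i.e. kappa (pattern[sigma])."""
    return ("call", kappa, pattern, sigma)

def chronicles_expr(e):
    """Chronicle representation |e| of an expression, as a set of tuples of actions."""
    result = {()}                                   # the empty chronicle is always present
    if e == OMEGA or e == FAIL:
        result.add((e,))                            # a single improper action
    else:
        _, kappa, pattern, sigma = e
        for c in chronicles_subst(sigma):           # c in |sigma| gives (kappa, p) . c
            result.add(((kappa, pattern),) + c)
    return result

def chronicles_subst(sigma):
    """Chronicle representation |sigma| of a substitution {variable: {pattern: body}}."""
    result = {()}
    for kappa1, continuation in sigma.items():
        for pattern1, body in continuation.items():
            for c in chronicles_expr(body):         # c in |sigma kappa' p'|
                result.add(((kappa1, pattern1),) + c)
    return result

def maximal(chronicles):
    """Chronicles with no proper extension in the given set."""
    return {c for c in chronicles
            if not any(len(d) > len(c) and d[:len(c)] == c for d in chronicles)}

UNIT = ("unit",)
# Call k with a value binding k1, where k1 diverges on the unit pattern.
e1 = call("k", ("var", "k1"), {"k1": {UNIT: OMEGA}})
# The same shape, except that k1 repeats the call (binding a fresh variable k2).
e2 = call("k", ("var", "k1"),
          {"k1": {UNIT: call("k", ("var", "k2"), {"k2": {UNIT: OMEGA}})}})

chron1 = chronicles_expr(e1)
chron2 = chronicles_expr(e2)
```

In this encoding the positive actions sit at even tuple indices (0, 2, …) of a chronicle of an expression, which corresponds to the odd-numbered positions of the definition.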

We can see that some of the conditions placed on chronicles in ludics are obvious properties of
chronicles in the chronicle representation of an expression or substitution.

Proposition 4.2.42 (Propriety).  For c ∈ |t|, all actions before the last are proper.

Proposition 4.2.43 (Positive/negative actions).  Let c ∈ |t|, where t is either an expression E :: (Γ ⊢ #)
or a substitution σ :: (Γ ⊢ Δ). Suppose we label the (possibly empty) sequence of proper actions in c by:

(κ₁, p₁) · (κ′₁, p′₁) · (κ₂, p₂) ⋯        if t = E

(κ′₁, p′₁) · (κ₂, p₂) · (κ′₂, p′₂) ⋯      if t = σ

Then there is some collection of frames Δ′₁, Δ₂, Δ′₂, … and types A₁, A′₁, A₂, A′₂, …, such that for all
proper actions (κᵢ, pᵢ) and (κ′ᵢ, p′ᵢ) in c:

κᵢ : •Aᵢ ∈ Γ, Δ₂, …, Δᵢ        pᵢ :: (Δ′ᵢ ⊩ Aᵢ)

κ′ᵢ : •A′ᵢ ∈ Δ′ᵢ               p′ᵢ :: (Δᵢ₊₁ ⊩ A′ᵢ)

We should remark that the condition on chronicles in ludics that "foci" (continuation variables)
be pairwise distinct does not hold for chronicles coming from arbitrary terms. By renaming
of variables we can assume that the κ′ᵢ coming from negative actions are distinct from all other
κ′ⱼ and κⱼ. However, the condition that κᵢ ≠ κⱼ for i ≠ j requires the affineness restriction. We
can verify some additional properties of chronicle representations, which in ludics are part of
the definition of designs.

Proposition 4.2.44 (Arborescence).  Chronicle representations are closed under restriction. Thus we
can say that |t| is generated by its maximal chronicles, i.e., c ∈ |t| with no extension in |t|.

Proposition 4.2.45 (Coherence).  If c, c′ ∈ |t|, then either one extends the other, or they first differ on
negative actions, that is, c = c₀ · α · c₁, c′ = c₀ · α′ · c₂, with α ≠ α′ negative.

Proposition 4.2.46 (Positivity).  The last action of a maximal chronicle must be positive. Moreover, if it
comes from a pure term, it must be proper or Ω. If it comes from a total term, it must be proper.

Example 4.2.47.  Consider the (N ⊗ N) ⊗ ¬N-continuation plus* defined in Example 4.2.11. The
chronicle representation of the singleton substitution (plus*/κ′) is generated by chronicles of the
form

(κ′, ((n₁, n₂), κ₂)) · (κ₂, n₁ + n₂)

The expression κ₁ (κ′[(plus*/κ′)]) (where κ₁ is some ¬((N ⊗ N) ⊗ ¬N)-continuation variable in
the context) has maximal chronicles of the form

(κ₁, κ′) · (κ′, ((n₁, n₂), κ₂)) · (κ₂, n₁ + n₂)


The  main  reason  we  are  interested  in  chronicle  representations  is  that  they  give  us  a  better 
handle  on  how  one  term  approximates  (or  fails  to  approximate)  another  term. 

Definition 4.2.48 (Definitional approximation).  For two terms t₁, t₂ :: (Γ ⊢ J) of L⁺_℧, we say that t₁
approximates t₂ (written t₁ ≤ t₂) if t₂ is obtained from t₁ by replacing some expressions Ω by expressions
of the form κ V, and some expressions of the form κ V by ℧. In other words, the approximation relation
≤ is generated the same way as the equality relation =, except that we take Ω and ℧ as least and greatest
expressions, respectively.

Proposition 4.2.49.  t₁ = t₂ iff t₁ ≤ t₂ and t₂ ≤ t₁

Definition 4.2.50.  We impose an ordering on positive actions:

Ω < (κ, p) < ℧

We take distinct proper actions (κ₁, p₁) and (κ₂, p₂) to be incomparable. We write α ≤ α′ if either α < α′
or α = α′, and α ≰ α′ if neither of these holds.

Note that we have to be a bit careful about what we mean when we say that (κ₁, p₁) and (κ₂, p₂)
are distinct: in the appropriate context, either κ₁ ≠ κ₂ (modulo renaming) or p₁ ≠ p₂ (ignoring
the names of variables).
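
The ordering of Definition 4.2.50 is small enough to write out as a pair of predicates. In the Python sketch below, proper actions are modelled as (variable, pattern) tuples and the improper actions as the strings "omega" and "fail"; these representations and the function names are assumptions of the sketch. In particular, two proper actions are compared by plain equality, which glosses over the renaming issues just mentioned.

```python
OMEGA = "omega"   # least action
FAIL = "fail"     # greatest action

def lt(a, b):
    """Strict order: omega is below every other action, fail is above every other action."""
    return a != b and (a == OMEGA or b == FAIL)

def leq(a, b):
    """a <= b: either a < b or the two actions coincide."""
    return a == b or lt(a, b)

act1 = ("k", ("unit",))
act2 = ("k", ("var", "k1"))

examples = {
    "omega <= proper": leq(OMEGA, act1),
    "proper <= fail": leq(act1, FAIL),
    "omega <= fail": leq(OMEGA, FAIL),
    "distinct proper actions comparable": leq(act1, act2) or leq(act2, act1),
    "fail <= omega": leq(FAIL, OMEGA),
}
```

Only the first three entries of `examples` hold: distinct proper actions are incomparable, and the failing action is never below the divergent one.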

Proposition 4.2.51.  If t₁ ≤ t₂ then for any c ∈ |t₁|, either c ∈ |t₂|, or else there is some prefix c₀ · α of
c, where α is positive and α < α′, such that c₀ · α′ ∈ |t₂|.

Proof.  By induction on the derivation of c ∈ |t₁|.  □

Proposition 4.2.52.  If t₁ ≰ t₂ then there is some c₀ and positive actions α ≰ α′, such that c₀ · α ∈ |t₁|
and c₀ · α′ ∈ |t₂|.




Proof.  By recursion on t₁ and t₂. From the assumption that t₁ ≰ t₂, we must have one of
the situations below. Note that for pure terms, only cases (3)-(6) are relevant, while for total
expressions only (4)-(6).

1.  t₁ = ℧, t₂ = Ω: take c₀ = ·, α = ℧, α′ = Ω

2.  t₁ = ℧, t₂ = (κ, p, σ): take c₀ = ·, α = ℧, α′ = (κ, p)

3.  t₁ = (κ, p, σ), t₂ = Ω: take c₀ = ·, α = (κ, p), α′ = Ω

4.  t₁ = (κ₁, p₁, σ₁), t₂ = (κ₂, p₂, σ₂) where κ₁ ≠ κ₂ or p₁ ≠ p₂: take c₀ = ·, α = (κ₁, p₁),
α′ = (κ₂, p₂)

5.  t₁ = (κ, p, σ₁), t₂ = (κ, p, σ₂) where σ₁ ≰ σ₂: from σ₁ ≰ σ₂, we obtain c′₀ and α ≰ α′ such
that c′₀ · α ∈ |σ₁| and c′₀ · α′ ∈ |σ₂|. Take c₀ = (κ, p) · c′₀.

6.  t₁ = σ₁, t₂ = σ₂, where σ₁ ≰ σ₂: by definition, there is some κ′ and p′ such that σ₁ κ′ p′ ≰
σ₂ κ′ p′, and hence we can obtain c′₀ and α ≰ α′ such that c′₀ · α ∈ |σ₁ κ′ p′| and c′₀ · α′ ∈
|σ₂ κ′ p′|. Take c₀ = (κ′, p′) · c′₀.

□


We  apply  Proposition  4.2.52  to  prove  the  main  result. 

Lemma 4.2.53.  Let E₁, E₂ :: (Γ ⊢ #) be two affine L⁺_℧ expressions, and suppose that E₁ ≰ E₂. Then
there exists an environment γ in L⁺_℧ such that (γ | E₁) ⇓ ℧ and (γ | E₂) ⇓ Ω.

Proof.  For notational convenience, let's relabel these expressions E* = E₁ and E† = E₂. Then
there is some c₀ · α* ∈ |E*| and c₀ · α† ∈ |E†|, where α* ≰ α†. By propriety, we know that c₀
must be a sequence of only proper actions (κ₁, p₁) · (κ′₁, p′₁) ⋯ (κₙ, pₙ) · (κ′ₙ, p′ₙ), together with
the appropriate conditions on the contexts Γ, Δ′₁, Δ₂, …, Δₙ₊₁ expressed in Prop. 4.2.43. Now,
the idea is that we build the Γ-environment γ so that when paired with E* or E†, it "walks
through" these proper actions, until it gets to either α* or α†, which are distinguishable.

Without loss of generality, we can view Γ as an annotated frame Δ₁ = Γ, and build γ as a
closed Δ₁-substitution σ′₁ rather than an environment. We construct σ′₁ together with a series of
closed substitutions σ′₂, …, σ′ₙ₊₁, such that

σ′ᵢ :: (· ⊢ Δᵢ)

This maintains the invariant that for any substitutions σ₁, …, σₖ, where

σᵢ :: (Δ₁, Δ′₁, …, Δ′ᵢ₋₁, Δᵢ ⊢ Δ′ᵢ)

we have that (σ′₁; σ₁; σ′₂; …; σₖ; σ′ₖ₊₁) is a (Δ₁, Δ′₁, Δ₂, …, Δ′ₖ, Δₖ₊₁)-environment.

We start by setting

σ′₁ κ₁ p₁ = (κ′₁, p′₁, σ′₂)

leaving σ′₂ unspecified for now. Since (κ₁, p₁) · (κ′₁, p′₁) ⋯ ∈ |E*|, |E†|, we know that both

E* = (κ₁, p₁, σ*₁)    and    σ*₁ κ′₁ p′₁ = E*₂

and

E† = (κ₁, p₁, σ†₁)    and    σ†₁ κ′₁ p′₁ = E†₂

for some σ*₁, σ†₁ and some E*₂, E†₂ such that (κ₂, p₂) · (κ′₂, p′₂) ⋯ ∈ |E*₂|, |E†₂|. So, we have both

(σ′₁ | (κ₁, p₁, σ*₁)) ⇝ ((σ′₁; σ*₁) | (κ′₁, p′₁, σ′₂)) ⇝ ((σ′₁; σ*₁; σ′₂) | E*₂)

and

(σ′₁ | (κ₁, p₁, σ†₁)) ⇝ ((σ′₁; σ†₁) | (κ′₁, p′₁, σ′₂)) ⇝ ((σ′₁; σ†₁; σ′₂) | E†₂)

by go+ transitions. Now, if κ₂ were guaranteed to be in Δ₂, we could go on to define

σ′₂ κ₂ p₂ = (κ′₂, p′₂, σ′₃)

However, we only know (Prop. 4.2.43) that κ₂ is in the domain of Δ₁, Δ₂. Thus we set

(σ′₁, σ′₂) κ₂ p₂ = (κ′₂, p′₂, σ′₃)

with the intended meaning that if κ₂ is bound in Δ₁, we set σ′₁ κ₂ p₂, and if it is bound in Δ₂
we set σ′₂ κ₂ p₂. And so on, for each 1 ≤ i ≤ n, we require that

(σ′₁, …, σ′ᵢ) κᵢ pᵢ = (κ′ᵢ, p′ᵢ, σ′ᵢ₊₁)

Note that for this definition to make sense, it is crucial that the pairs (κᵢ, pᵢ) are distinct, because
otherwise we might give conflicting entries for the substitution maps. The affineness restriction is
a sufficient condition for this, since it guarantees that the κᵢ are distinct. We are also leaving many
of the entries in the domain of the σ′ᵢ unspecified, because they are not covered by any proper
action (κᵢ, pᵢ): we can set these entries to arbitrary expressions (in the appropriate context), for
example to Ω or ℧.

Finally, after building σ′₁, …, σ′ₙ to "walk through" the common chronicle c₀ in both E* and
E†, we arrive at the actions α* ≰ α†. These we can distinguish by setting

(σ′₁, …, σ′ₙ₊₁) κ* p* = ℧

if α* = (κ*, p*) is a proper action, and likewise

(σ′₁, …, σ′ₙ₊₁) κ† p† = Ω

if α† = (κ†, p†) is a distinct proper action. Note that because α* ≰ α†, if α* is improper it must
be ℧, and if α† is improper it must be Ω. By repeating the above argument, we can verify that

(σ′₁ | E*) ⇝ ⇝ ((σ′₁; σ*₁; σ′₂) | E*₂) ⇝ ⋯ ⇝ ((σ′₁; σ*₁; σ′₂; …; σ*ₙ; σ′ₙ₊₁) | E*ₙ₊₁) ⇓ ℧
(σ′₁ | E†) ⇝ ⇝ ((σ′₁; σ†₁; σ′₂) | E†₂) ⇝ ⋯ ⇝ ((σ′₁; σ†₁; σ′₂; …; σ†ₙ; σ′ₙ₊₁) | E†ₙ₊₁) ⇓ Ω

where σ*₁, …, σ*ₙ, E*ₙ₊₁ and σ†₁, …, σ†ₙ, E†ₙ₊₁ are some subterms of E* and E† respectively, and
where α* ∈ |E*ₙ₊₁|, α† ∈ |E†ₙ₊₁|.  □


Theorem  4.2.54  (Affine  separation).  For  pure  affine  expressions,  E\  =  E2  iff  E\  =<>  E->. 


Proof. By Theorem 4.2.38 and Lemma 4.2.53. □

Note that Lemma 4.2.53 actually allows us to make this statement stronger: for two arbitrary (not necessarily pure) $\mathcal{L}^+_{\mho}$ expressions, definitional equality coincides with observational equivalence. But while this is true here, in general we won't ask for such a coincidence, because it requires making sure we have the right equational theory for effectful terms. Doing this for arbitrary effects lies outside the scope of this thesis. Instead, we have the more modest goal of verifying that definitional equality is the right equational theory for pure terms in the presence of arbitrary effects.

As we highlighted in the proof, the affine restriction is crucial for the result in $\mathcal{L}^+_{\mho}$. We can demonstrate this explicitly by exhibiting two syntactically distinct but observationally indistinguishable expressions.

Proposition 4.2.55. There exist two pure, non-affine expressions $E_1, E_2$ such that $E_1 \ne E_2$ but $E_1 \cong_{\mho} E_2$.

Proof. We define $E_1$ and $E_2$ in the one-variable context $\kappa : \bullet\neg 1$:

$$E_1 = \kappa\ (\_[(() \mapsto \Omega)])$$

$$E_2 = \kappa\ (\_[(() \mapsto E_1)])$$

To gloss these expressions in words, $E_1$ calls $\kappa$ with a 1-continuation (treated as a $\neg 1$-value) that when invoked immediately diverges, while $E_2$ calls $\kappa$ with a 1-continuation that when invoked executes $E_1$. The chronicle representation of $E_1$ is generated by a single maximal chronicle (picking an arbitrary name for the continuation supplied by the value $\_[(() \mapsto \Omega)]$):

$$(\kappa, \kappa') \cdot (\kappa', ()) \cdot \Omega$$

and that of $E_2$ by the following:

$$(\kappa, \kappa') \cdot (\kappa', ()) \cdot (\kappa, \kappa'') \cdot (\kappa'', ()) \cdot \Omega$$

Note that $E_1 \ne E_2$ (although $E_1 \le E_2$). Also note that the two actions $(\kappa, \kappa')$ and $(\kappa, \kappa'')$ in $|E_2|$ are two occurrences of the same action, since the pattern components are compared ignoring variable names.

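Now, consider any possible environment $\gamma$ for $\kappa : \bullet\neg 1$. In $\mathcal{L}^+_{\mho}$ there are just three possibilities:

1. $\gamma\ \kappa\ \kappa' = \Omega$

2. $\gamma\ \kappa\ \kappa' = (\kappa', (), \cdot)$

3. $\gamma\ \kappa\ \kappa' = \mho$

In situations (1) and (3), it is obvious that $\langle \gamma \mid E_1 \rangle$ and $\langle \gamma \mid E_2 \rangle$ yield the same result ($\Omega$ and $\mho$, respectively). In case (2), we have that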

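$$\begin{aligned}
\langle \gamma \mid E_1 \rangle &\leadsto \langle (\gamma; (() \mapsto \Omega/\kappa')) \mid (\kappa', (), \cdot) \rangle \\
&\leadsto \langle (\gamma; (() \mapsto \Omega/\kappa'); \cdot) \mid \Omega \rangle \\
&\Downarrow \Omega
\end{aligned}$$

but also

$$\begin{aligned}
\langle \gamma \mid E_2 \rangle &\leadsto \langle (\gamma; (() \mapsto E_1/\kappa')) \mid (\kappa', (), \cdot) \rangle \\
&\leadsto \langle (\gamma; (() \mapsto E_1/\kappa'); \cdot) \mid E_1 \rangle \\
&\leadsto \langle (\gamma; (() \mapsto E_1/\kappa'); \cdot; (() \mapsto \Omega/\kappa'')) \mid (\kappa'', (), \cdot) \rangle \\
&\leadsto \langle (\gamma; (() \mapsto E_1/\kappa'); \cdot; (() \mapsto \Omega/\kappa''); \cdot) \mid \Omega \rangle \\
&\Downarrow \Omega
\end{aligned}$$

Hence $E_1 \cong_{\mho} E_2$. □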
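The three-way case analysis in this proof can be replayed concretely. The following Python sketch is only an illustration under simplifying assumptions, not part of the formal development: the results $\Omega$ and $\mho$ are modelled as string tokens (so divergence is a returned marker rather than actual non-termination), a 1-continuation is a zero-argument function, and the names `env_diverge`, `env_fail` and `env_call` are hypothetical stand-ins for the three possible stateless environment entries for $\kappa$.

```python
# Tokens standing in for the two results: divergence and immediate failure.
OMEGA, MHO = "Omega", "Mho"

# The three stateless behaviours of the environment when kappa is called
# with a 1-continuation k1.
def env_diverge(k1):
    return OMEGA

def env_fail(k1):
    return MHO

def env_call(k1):
    return k1()  # pass the unit value () to k1

def E1(kappa):
    # Call kappa with a continuation that diverges when invoked.
    return kappa(lambda: OMEGA)

def E2(kappa):
    # Call kappa with a continuation that executes E1 when invoked.
    return kappa(lambda: E1(kappa))

# Each stateless environment gives the same result for E1 and E2.
results = {env.__name__: (E1(env), E2(env))
           for env in (env_diverge, env_fail, env_call)}
```

Because a stateless entry must answer every occurrence of the action in the same way, the extra call made by `E2` cannot be told apart from the first one.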

On the other hand, conceptually the argument in Lemma 4.2.53 did not really require very much. Any two distinct expressions $E^* \ne E^\dagger$ must contain respective chronicles $c_0 \cdot a^* \ne c_0 \cdot a^\dagger$, and so all we needed were:

1. Two distinct results to distinguish the last actions $a^*$ and $a^\dagger$

2. An environment that crawls through the common prefix $c_0$

For (1), we used the two distinct results available in $\mathcal{L}^+_{\mho}$: divergence and immediate failure. For (2), we remained within the total fragment of $\mathcal{L}^+$, but relied on the affineness restriction, which ensured that all proper positive actions $(\kappa_i, p_i)$ and $(\kappa_j, p_j)$ were distinct, and hence could be given unique, well-defined entries (respectively $(\kappa'_i, p'_i, \sigma'_{i+1})$ and $(\kappa'_j, p'_j, \sigma'_{j+1})$) in the environment.

So  where  does  that  leave  us?  To  build  the  chronicle-crawling  environment  in  the  general 
case,  the  idea  is  that  we  just  include  an  additional  piece  of  state  as  a  counter,  updated  during  the 
course  of  evaluation.  By  querying  this  counter,  the  environment  can  return  different  expressions 
in  response  to  distinct  occurrences  of  the  same  positive  action. 


4.2.15  Ground  state,  and  the  separation  theorem 


To encode a counter, we will use a slightly more general effect: ground (integer) state. We extend $\mathcal{L}^+$ with a pair of operations for building expressions:

$$\dfrac{\mathbb{N} \longrightarrow \Gamma \vdash \#}{\Gamma \vdash \#}\ \mathsf{read} \qquad\qquad \dfrac{\mathbb{N} \qquad \Gamma \vdash \#}{\Gamma \vdash \#}\ \mathsf{write}$$

The operational intuition here is standard: $\mathsf{read}(E_i)_{i \in \mathbb{N}}$ reads the value $i$ of an integer variable and then executes $E_i$, while $\mathsf{write}(j, E)$ writes the value $j$ to the variable before executing $E$. To make this intuition precise, we first have to extend our notion of environment.

Definition 4.2.56 (Stateful environments). A stateful environment is an ordinary environment $\gamma$ paired with an integer $j$, which we write $\mathsf{st}(j, \gamma)$. The bind operation is extended to stateful environments by the equation $\mathsf{bind}(\mathsf{st}(j, \gamma), \sigma) = \mathsf{st}(j, \mathsf{bind}(\gamma, \sigma))$, and the lookup operation by the equation $\mathsf{lookup}(\mathsf{st}(j, \gamma), \kappa) = \mathsf{lookup}(\gamma, \kappa)$.


With this updated definition of environments and of the environment operations, we only need to add two rules to define our small-step semantics for state:

$$\langle \mathsf{st}(i, \gamma) \mid \mathsf{read}(E_i)_{i \in \mathbb{N}} \rangle \leadsto \langle \mathsf{st}(i, \gamma) \mid E_i \rangle \qquad (\mathsf{read}_i)$$

$$\langle \mathsf{st}(i, \gamma) \mid \mathsf{write}(j, E) \rangle \leadsto \langle \mathsf{st}(j, \gamma) \mid E \rangle \qquad (\mathsf{write}_j)$$

We call this extension $\mathcal{L}^+_{r/w}$ and write $\cong_{r/w}$ for the derived observational equivalence relation.
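The two state rules can be read as a tiny abstract machine. The sketch below is a hedged illustration rather than the thesis's definition: it keeps only the integer component of a stateful environment (the ordinary environment $\gamma$ is omitted), represents expressions as tagged tuples, and the names `step`, `run` and the `"result"` tag are assumptions introduced here.

```python
def step(state, expr):
    """Perform one read/write step, or return None if neither rule applies."""
    tag = expr[0]
    if tag == "read":
        # <st(i, g) | read(E_i)>  ~>  <st(i, g) | E_i>
        family = expr[1]          # a function from integers to expressions
        return state, family(state)
    if tag == "write":
        # <st(i, g) | write(j, E)>  ~>  <st(j, g) | E>
        _, j, body = expr
        return j, body
    return None                   # other expressions are not modelled here

def run(state, expr):
    """Iterate step until no state rule applies."""
    while True:
        nxt = step(state, expr)
        if nxt is None:
            return state, expr
        state, expr = nxt

# Increment the stored integer, then read it back.
prog = ("read", lambda i: ("write", i + 1, ("read", lambda j: ("result", j))))
final_state, final_expr = run(1, prog)
```

Starting from state 1, the program first reads 1, writes 2, and then reads 2, which is exactly the counter behaviour the separation argument relies on.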

Of course, we have to verify that $\mathcal{L}^+_{r/w}$ doesn't break the type safety and congruence properties of $\mathcal{L}^+_{\mho}$, but this is more or less immediate, because the new versions of bind and lookup preserve all the properties of the old versions, and the $\mathsf{read}_i$ and $\mathsf{write}_j$ rules are trivial. The reader may notice that we have not fixed a notion of definitional equality for read and write expressions. In this case, we could try to come up with a sensible definition. However, as I explained above, in general the question of axiomatizing equality for effectful terms lies outside the scope of this thesis. Because adding read and write to $\mathcal{L}^+_{\mho}$ creates no new results, we have already fully determined the observational equivalence relation $\cong_{r/w}$, and can thus already compare it to definitional equality on pure terms of $\mathcal{L}^+$.

Theorem 4.2.57 (Separation). Let $E_1, E_2$ be arbitrary pure expressions. Then $E_1 = E_2$ iff $E_1 \cong_{r/w} E_2$.

Proof. In the forward direction, we apply Theorem 4.2.38. In the backward direction, to distinguish $E_1 \ne E_2$ with some stateful environment, we extend the construction of Lemma 4.2.53 to read the index of the current positive action as input. Recall, before we required that

$$(\sigma'_1, \ldots, \sigma'_i)\ \kappa_i\ p_i = (\kappa'_i, p'_i, \sigma'_{i+1})$$

for all proper positive actions $(\kappa_i, p_i)$, $1 \le i \le n$, in the common prefix $c_0$, as well as that

$$(\sigma'_1, \ldots, \sigma'_{n+1})\ \kappa^*\ p^* = \mho$$

and

$$(\sigma'_1, \ldots, \sigma'_{n+1})\ \kappa^\dagger\ p^\dagger = \Omega$$

if the distinct final actions were proper, $a^* = (\kappa^*, p^*)$, $a^\dagger = (\kappa^\dagger, p^\dagger)$. We now update these conditions:

$$(\sigma'_1, \ldots, \sigma'_i)\ \kappa_i\ p_i\ i = (i + 1, \kappa'_i, p'_i, \sigma'_{i+1})$$

$$(\sigma'_1, \ldots, \sigma'_{n+1})\ \kappa^*\ p^*\ (n+1) = \mho$$

$$(\sigma'_1, \ldots, \sigma'_{n+1})\ \kappa^\dagger\ p^\dagger\ (n+1) = \Omega$$

where a clause $(j, \kappa, p, \sigma)$ stands for the expression $\mathsf{write}(j, \kappa\ (p[\sigma]))$, and a clause $\sigma\ \kappa\ p\ j = E$ has the intended reading that $\sigma(\kappa)(p) = \mathsf{read}(E_i)_{i \in \mathbb{N}}$ for some collection of expressions $(E_i)_{i \in \mathbb{N}}$ such that $E_j = E$ (and is otherwise arbitrary). The reader can verify that this is a legitimate definition of the $\sigma'_i$, irrespective of whether the expressions are affine, because the counter ensures there are no conflicting entries. We then execute $E_1$ and $E_2$ in the environment $\mathsf{st}(1, \sigma'_1)$. The same argument as in Lemma 4.2.53 shows that $E_1$ evaluates to $\mho$ and $E_2$ to $\Omega$. □


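Example 4.2.58. Let $E_1$ and $E_2$ be as in the proof of Proposition 4.2.55. Recall that $E_1$ contains maximal chronicle

$$(\kappa, \kappa') \cdot (\kappa', ()) \cdot \Omega$$

and $E_2$ contains

$$(\kappa, \kappa') \cdot (\kappa', ()) \cdot (\kappa, \kappa'') \cdot (\kappa'', ()) \cdot \Omega$$

Then we can distinguish $E_1$ and $E_2$ in the environment $\mathsf{st}(1, \sigma'_1)$ and initial state 1, where

$$\sigma'_1\ \kappa\ \kappa'\ 1 = (2, \kappa', (), \cdot)$$

$$\sigma'_1\ \kappa\ \kappa'\ 2 = \mho$$

We have that

$$\begin{aligned}
\langle \mathsf{st}(1, \sigma'_1) \mid E_1 \rangle &\leadsto^* \langle \mathsf{st}(1, \mathsf{bind}(\sigma'_1, (() \mapsto \Omega/\kappa'))) \mid (2, \kappa', (), \cdot) \rangle \\
&\leadsto^* \langle \mathsf{st}(2, \mathsf{bind}(\mathsf{bind}(\sigma'_1, (() \mapsto \Omega/\kappa')), \cdot)) \mid \Omega \rangle \\
&\Downarrow \Omega
\end{aligned}$$

$$\begin{aligned}
\langle \mathsf{st}(1, \sigma'_1) \mid E_2 \rangle &\leadsto^* \langle \mathsf{st}(1, \mathsf{bind}(\sigma'_1, (() \mapsto E_1/\kappa'))) \mid (2, \kappa', (), \cdot) \rangle \\
&\leadsto^* \langle \mathsf{st}(2, \mathsf{bind}(\mathsf{bind}(\sigma'_1, (() \mapsto E_1/\kappa')), \cdot)) \mid E_1 \rangle \\
&\leadsto^* \langle \mathsf{st}(2, \mathsf{bind}(\mathsf{bind}(\mathsf{bind}(\sigma'_1, (() \mapsto E_1/\kappa')), \cdot), (() \mapsto \Omega/\kappa''))) \mid \mho \rangle \\
&\Downarrow \mho
\end{aligned}$$

And so $E_1$ and $E_2$ are observationally distinct in $\mathcal{L}^+_{r/w}$. ■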
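The role of the counter in this example can be mimicked directly. The following Python sketch rests on loudly stated assumptions: results are string tokens (divergence is a marker, not real non-termination), a 1-continuation is a zero-argument function, and the class name `CountingEnv` is hypothetical. Its `kappa` method plays the part of the environment entry that answers the first occurrence of the action by writing 2 and calling the continuation, and the second occurrence by failing.

```python
OMEGA, MHO = "Omega", "Mho"

class CountingEnv:
    """Environment entry for kappa that consults an integer state."""

    def __init__(self):
        self.state = 1           # initial state 1

    def kappa(self, k1):
        if self.state == 1:
            self.state = 2       # corresponds to write(2, ...)
            return k1()          # ... followed by calling k1 with ()
        return MHO               # any later occurrence of the action fails

def E1(kappa):
    return kappa(lambda: OMEGA)

def E2(kappa):
    return kappa(lambda: E1(kappa))

# Each expression is run in a fresh stateful environment.
r1 = E1(CountingEnv().kappa)
r2 = E2(CountingEnv().kappa)
```

Here `E1` makes one call and reaches its diverging continuation, while `E2` makes a second call that the counter exposes, so the two results differ.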


4.3 $\mathcal{L}$: A language with mixed evaluation order

Now that we have defined $\mathcal{L}^+$ and explored it at some length, we should have no problem generalizing to the language $\mathcal{L}$, which is the type-free notation for arbitrary canonical derivations mixing positive and negative types. For simplicity, we still omit atomic types, though the interested reader should have no trouble reconstructing them from the development in §2.3.2.


4.3.1  Continuation  patterns  and  value  frames 

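We begin by describing how to define negative types by their continuation patterns. Unannotated frames $\Delta$ can now contain negative value holes $A^-$, while annotated frames can contain value variables $x : A^-$.

Recall the rules from §2.2 for building refutation patterns:

$$\dfrac{}{A^- \Vdash \bullet\overline{\neg}A} \qquad \dfrac{}{\cdot \Vdash \bullet\bot} \qquad \dfrac{\Delta_1 \Vdash \bullet A \qquad \Delta_2 \Vdash \bullet B}{\Delta_1, \Delta_2 \Vdash \bullet A \parr B}$$

$$\text{(no rule for } \top\text{)} \qquad \dfrac{\Delta \Vdash \bullet A}{\Delta \Vdash \bullet A \& B} \qquad \dfrac{\Delta \Vdash \bullet B}{\Delta \Vdash \bullet A \& B}$$

Now we assign labels to the rules:

$$\dfrac{}{A^- \Vdash \bullet\overline{\neg}A}\ \_ \qquad \dfrac{}{\cdot \Vdash \bullet\bot}\ [] \qquad \dfrac{\Delta_1 \Vdash \bullet A \qquad \Delta_2 \Vdash \bullet B}{\Delta_1, \Delta_2 \Vdash \bullet A \parr B}\ \mathsf{copair}$$

$$\text{(no rule for } \top\text{)} \qquad \dfrac{\Delta \Vdash \bullet A}{\Delta \Vdash \bullet A \& B}\ \mathsf{fst;} \qquad \dfrac{\Delta \Vdash \bullet B}{\Delta \Vdash \bullet A \& B}\ \mathsf{snd;}$$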


These labels give us a type-free notation for (refutation/continuation) patterns. For example, copair(_, fst;_) and copair(_, snd;_) are two $(\overline{\neg}A \parr (\overline{\neg}B \& \overline{\neg}C))$-patterns with frames $A, B$ and $A, C$, respectively.

Again, we can also rewrite the rule for $\overline{\neg}$ with an explicit variable name:

$$\dfrac{}{x : A^- \Vdash \bullet\overline{\neg}A}\ x$$

Then the above patterns could be written as copair($x_1$, fst;$x_2$) and copair($x_1$, snd;$x_2$).

We  use  the  letter  d  to  range  over  continuation  patterns.  Like  we  did  with  value  pattern  con¬ 
structors,  we  associate  unary  continuation  pattern  constructors  to  the  right,  so  that  for  example 
fst;snd;x  is  shorthand  for  fst;(snd;x). 

Again,  we  don't  attach  importance  to  this  particular  collection  of  continuation  patterns.  As 
one  interesting  example  of  a  negative  datatype  outside  the  propositional  fragment,  we  can  intro¬ 
duce  the  type  of  streams  of  A's  (where  A  is  negative),  with  the  following  continuation  patterns: 

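$$\dfrac{\Delta \Vdash \bullet A}{\Delta \Vdash \bullet\mathcal{S}A}\ \mathsf{hd;} \qquad \dfrac{\Delta \Vdash \bullet\mathcal{S}A}{\Delta \Vdash \bullet\mathcal{S}A}\ \mathsf{tl;}$$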

Continuation  patterns  do  not  have  direct  counterparts  in  mainstream  functional  languages,  so  it 
takes  some  time  to  get  an  intuition  for  them — but  the  idea  is  that  they  axiomatize  the  possible 
observations  on  values  of  negative  type.  To  observe  a  stream  we  can  either  observe  its  head  or 
observe  its  tail,  for  instance. 

4.3.2  Terms 

Given a collection of value and continuation patterns, we construct the language $\mathcal{L}$ by including all the term-forming rules of $\mathcal{L}^+$, and adding a few more to deal with negative types.

The rules for building negative values and continuations are precisely dual to the rules for building their positive counterparts. Accordingly, we can come up with some new slogans. Let's examine the rule for building negative values:

$$\dfrac{\overset{d}{\Delta \Vdash \bullet A^-} \longrightarrow \overset{E_d}{\Gamma, \Delta \vdash \#}}{\Gamma \vdash A^-}$$

The  slogan  behind  this  rule  is  that 

a negative value is a map from continuation patterns to expressions

or  in  a  slightly  punchier  version, 

a  negative  value  is  a  map  from  observations  to  actions 

The  operational  intuition  behind  this  slogan  is  that  negative  values  are  computed  on  demand,  as 
the environment makes different observations. Consider, for example the type $\mathcal{S}\bot$. Since $\bot$ has only one observation [], observations on $\mathcal{S}\bot$ are of the form hd;[], tl;hd;[], tl;tl;hd;[], etc., which we can gloss as "get the first element and then stop", "get the second element and then stop", "get the third element and then stop", etc. We build a $\mathcal{S}\bot$-value $V$ by specifying which expression to execute for each of these continuation patterns:

$$\begin{aligned}
V\ \mathsf{hd;[]} &= E_0 \\
V\ \mathsf{tl;hd;[]} &= E_1 \\
V\ \mathsf{tl;tl;hd;[]} &= E_2 \\
&\ \ \vdots
\end{aligned}$$

Incidentally, the reader may notice a resemblance between $\mathcal{S}\bot$-values and $\mathbb{N}$-continuations. As we did with positive continuations, we allow negative values to be defined by partial recursive maps, reifying the possibility that $V(d)$ diverges as $V(d) = \Omega$.
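The slogan that a negative value is a map from observations to expressions can be made tangible with a small data-level model. The sketch below is an assumption-laden illustration: continuation patterns are encoded as Python tuples, expressions are stand-in strings, divergence is the token `OMEGA`, and the helper names `observation` and `apply_value` are hypothetical.

```python
OMEGA = "Omega"  # stand-in token for divergence

def observation(n):
    """The continuation pattern tl;...;tl;hd;[] with n occurrences of tl."""
    return ("tl",) * n + ("hd", "[]")

# A value given as a partial map from observations to expressions.
V = {
    observation(0): "E0",
    observation(1): "E1",
    observation(2): "E2",
}

def apply_value(value, d):
    """Look up the expression for observation d; undefined entries diverge."""
    return value.get(d, OMEGA)
```

For instance, `apply_value(V, observation(1))` selects the expression for "get the second element and then stop", while an observation outside the table is read as divergence. The indexing by the number of `tl`s is what makes such values resemble continuations on natural numbers.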

We construct negative continuations as follows:

$$\dfrac{\overset{d}{\Delta \Vdash \bullet A^-} \qquad \overset{\sigma}{\Gamma \vdash \Delta}}{\Gamma \vdash \bullet A^-}$$

writing the result as $d[\sigma]$. Thus we say that

a negative continuation is a continuation pattern under a substitution

Again, this is a somewhat unfamiliar interpretation of continuations: the "rest of the computation" for a negative value can always be put into a normal form that can be pattern-matched against.3 For example, by this interpretation, there can never be any $\top$-continuations even in the presence of non-termination and effects; contrast this with the two different 1-continuations we can build using $\Omega$ and $\mho$. Again, it takes some time to build an intuition here. As we will see below, the really interesting negative continuations arise for types that mix negative with positive polarity.

Substitutions in $\mathcal{L}$ are constructed in the same way as in $\mathcal{L}^+$, except that they can contain negative values as well as positive continuations. We write $\sigma(x)$ for the action of a $\Delta$-substitution on a value variable in $\Delta$. Finally, a value variable can be used by pairing it with a negative continuation:

$$\dfrac{x : A^- \in \Gamma \qquad \overset{K}{\Gamma \vdash \bullet A^-}}{\Gamma \vdash \#}$$

which we write as $x\ K$. Operationally, $x\ K$ is interpreted as passing the continuation $K$ to the value denoted by $x$ at runtime, a convention underlying control operators such as callcc, as well as the call-by-name CPS transform.

4.3.3  Mixed  polarity  types 

The most interesting features of $\mathcal{L}$ arise from mixing positive and negative polarity. At the heart of this interaction are the shift connectives $\uparrow$ and $\downarrow$. The patterns for the shifts are trivial, and we give them in their variable-annotated versions:

$$x : A^- \Vdash {\downarrow}A \qquad\qquad \kappa : \bullet A^+ \Vdash \bullet{\uparrow}A$$

Intuitively, the value pattern rule for $\downarrow$ says that we cannot decompose a negative value into more basic parts: we can only bind it to a variable. The rule for $\uparrow$ expresses the dual principle, that we cannot decompose a positive continuation.

3 In the literature on call-by-value/call-by-name duality, these inductively constructed continuations are sometimes called "covalues" [Wadler, 2003].

For example, the negative type $\mathcal{S}({\uparrow}\mathbb{B})$ can be thought of as representing infinite boolean sequences (sometimes called the Cantor space). Its continuation patterns are hd;$\kappa$, tl;hd;$\kappa$, tl;tl;hd;$\kappa$, etc., which all bind a single continuation variable $\kappa : \bullet\mathbb{B}$. Abstractly, $\kappa$ stands for what to do with the $n$th boolean in the stream. Thus, we can represent an infinite, alternating sequence of 0s and 1s as a $\mathcal{S}({\uparrow}\mathbb{B})$-value, by the following definition:

$$\begin{aligned}
V\ \mathsf{hd;}\kappa &= \kappa\ \mathsf{TT} \\
V\ \mathsf{tl;hd;}\kappa &= \kappa\ \mathsf{FF} \\
V\ \mathsf{tl;tl;hd;}\kappa &= \kappa\ \mathsf{TT} \\
V\ \mathsf{tl;tl;tl;hd;}\kappa &= \kappa\ \mathsf{FF} \\
&\ \ \vdots
\end{aligned}$$

Of course, in the presence of effects, this type contains more than just the Cantor space. For example, rather than passing a value to the continuation variable, $V$ could diverge ($\Omega$) or abort ($\mho$).
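The alternating stream can be rendered as a function on observations. This Python sketch is only an approximation under stated assumptions: an observation tl;...;tl;hd;$\kappa$ is encoded as a tuple whose last component is a Python callable standing for $\kappa$, booleans are the strings `"TT"` and `"FF"`, and the names `alternating` and `nth` are hypothetical.

```python
def alternating(d):
    """A stream value: d is the tuple ("tl", ..., "tl", "hd", kappa)."""
    *tls, hd, kappa = d
    if hd != "hd" or any(t != "tl" for t in tls):
        raise ValueError("not a stream observation")
    # Even positions answer TT, odd positions answer FF.
    return kappa("TT" if len(tls) % 2 == 0 else "FF")

def nth(stream, n):
    """Observe element n by building the pattern with n tl's, then hd."""
    return stream(("tl",) * n + ("hd", lambda b: b))

first_four = [nth(alternating, n) for n in range(4)]
```

The stream never materialises its elements: each one is produced only in response to the corresponding observation, and the answer is delivered by calling the continuation carried in the pattern.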

The positive type ${\downarrow}{\uparrow}\mathbb{N} \otimes {\downarrow}{\uparrow}\mathbb{N}$ can be thought of as representing a pair of suspended $\mathbb{N}$-computations. It has a single pattern, $(x_1, x_2)$, which binds a pair of variables $x_1 : {\uparrow}\mathbb{N}$, $x_2 : {\uparrow}\mathbb{N}$. For instance,

$$(\_, \_)[(\kappa \mapsto \kappa\ \mathsf{Z},\ \kappa \mapsto \Omega)]$$

is a positive value containing a pair of suspended computations: when invoked with an $\mathbb{N}$-continuation, the first computation always returns zero, the second diverges.

The shifts also let us define some interesting new datatypes. For example, we can define the type of lazy lists as the negative type $\mathbb{L}'A = {\uparrow}\mathbb{L}A$, where the positive type $\mathbb{L}A$ is defined by the following value patterns:

$$\dfrac{}{\cdot \Vdash \mathbb{L}A}\ \mathsf{nil} \qquad\qquad \dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash {\downarrow}\mathbb{L}'A}{\Delta_1, \Delta_2 \Vdash \mathbb{L}A}\ \mathsf{cons}$$

Note that all $\mathbb{L}A$ patterns have the form nil or cons($p$, $x$), because the second premise of the cons rule can only be satisfied by the trivial pattern for $\downarrow$ (setting $\Delta_2 = x : \mathbb{L}'A$). It is important to realize that lazy lists (ubiquitous in Haskell and encodable in ML) are not the same thing as the streams we defined above, having very different patterns. One dimension where they differ, for instance, is that values of type $\mathbb{L}'A$ can represent lists of $A$'s of finite length, whereas values of type $\mathcal{S}{\uparrow}A$ (note the polarity shift because $A$ is positive) can only represent infinite lists. More generally, we can view Haskell-style lazy sums as represented by negative types of the form ${\uparrow}({\downarrow}A \oplus {\downarrow}B)$.
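The contrast between lazy lists and streams can be illustrated in direct style. The Python sketch below is a simplification and makes several assumptions: a suspended computation is modelled as a zero-argument function (instead of a value that receives a continuation), list cells are tagged tuples, and the names `from_list`, `take` and `naturals_stream` are hypothetical.

```python
def from_list(xs):
    """Build a lazy list: forcing it yields ("nil",) or ("cons", head, tail)."""
    def force():
        if not xs:
            return ("nil",)
        return ("cons", xs[0], from_list(xs[1:]))
    return force

def take(n, lazy):
    """Force at most n cells of a lazy list and collect the heads."""
    out = []
    while n > 0:
        cell = lazy()
        if cell[0] == "nil":
            break
        _, head, lazy = cell
        out.append(head)
        n -= 1
    return out

def naturals_stream(n_tails):
    """A stream answers every observation tl^n;hd; there is no nil case."""
    return n_tails
```

A lazy list may answer a forcing request with `nil`, so `take(5, from_list([1, 2, 3]))` stops after three elements, whereas the stream must answer every head observation and therefore cannot represent a finite sequence.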

In addition to the shifts, we give patterns for the connectives $\to$ and $-$:

$$\dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash \bullet B^-}{\Delta_1, \Delta_2 \Vdash \bullet A \to B}\ \mathsf{app} \qquad\qquad \dfrac{\Delta_1 \Vdash A^+ \qquad \Delta_2 \Vdash \bullet B^-}{\Delta_1, \Delta_2 \Vdash A - B}\ \mathsf{coapp}$$

The app rule expresses the following principle about the mode of use of a function type:

To use a value of type $A \to B$, we must provide an argument $A$, and use the result $B$

Continuation patterns for $A \to B$ thus take the form app($p$, $d$), where $p$ is an $A$-value pattern, and $d$ is a $B$-continuation pattern. Because app patterns are common, we use the shorthand $p@d$ = app($p$, $d$). We treat @ as right associative, so that we can apply the Currying convention:

Observation 4.3.1. A function of type $(A_1 \otimes \cdots \otimes A_n) \to B$ (with the $A_i$ positive and $B$ negative) is essentially the same as a function of type $A_1 \to \cdots \to A_n \to B$, the former having continuation patterns of shape $(p_1, \ldots, p_n)@d$, the latter of shape $p_1@ \ldots @p_n@d$. (The two types are isomorphic, in the sense of Definition 4.3.6 below.)
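The correspondence between the two pattern shapes in Observation 4.3.1 is a simple reshaping of data. The following Python sketch shows it under an assumed encoding: an uncurried observation $(p_1, \ldots, p_n)@d$ is the pair `((p1, ..., pn), d)`, a curried one $p_1@\ldots@p_n@d$ is the right-nested pair `(p1, (p2, ... (pn, d)))`, and the function names are hypothetical.

```python
def curry_pattern(obs):
    """((p1, ..., pn), d)  ->  (p1, (p2, ... (pn, d)))"""
    ps, d = obs
    for p in reversed(ps):
        d = (p, d)     # wrap from the innermost argument outwards
    return d

def uncurry_pattern(obs, n):
    """(p1, (p2, ... (pn, d)))  ->  ((p1, ..., pn), d), for a known arity n."""
    ps = []
    for _ in range(n):
        p, obs = obs
        ps.append(p)
    return (tuple(ps), obs)
```

The two functions are mutually inverse on observations of the stated arity, which is the pattern-level content of the isomorphism: nothing is computed, only the nesting changes.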


Example 4.3.2. Although in the traditional setting of λ-calculus we build functions using the $\lambda x.(-)$ construct, in practical functional languages like ML and Haskell we invariably use pattern-matching. It is worthwhile to see how definition of functions by pattern-matching emerges naturally in $\mathcal{L}$ from the definition of the function type by its continuation patterns.

Consider the continuation transformer $\mathsf{and}_\kappa$ from Example 4.2.9. We defined it as a positive continuation accepting type $\mathbb{B} \otimes \mathbb{B}$, assuming $\kappa$ is a continuation variable accepting $\mathbb{B}$. But now we can alternatively define it as a negative value of type $(\mathbb{B} \otimes \mathbb{B}) \to {\uparrow}\mathbb{B}$:

$$\begin{aligned}
\mathsf{and}\ (\mathsf{tt}, \mathsf{tt})@\kappa &= \kappa\ \mathsf{TT} \\
\mathsf{and}\ (\mathsf{tt}, \mathsf{ff})@\kappa &= \kappa\ \mathsf{FF} \\
\mathsf{and}\ (\mathsf{ff}, \mathsf{tt})@\kappa &= \kappa\ \mathsf{FF} \\
\mathsf{and}\ (\mathsf{ff}, \mathsf{ff})@\kappa &= \kappa\ \mathsf{FF}
\end{aligned}$$

or as a Curried function of type $\mathbb{B} \to \mathbb{B} \to {\uparrow}\mathbb{B}$:

$$\begin{aligned}
\mathsf{and}\ \mathsf{tt}@\mathsf{tt}@\kappa &= \kappa\ \mathsf{TT} \\
\mathsf{and}\ \mathsf{tt}@\mathsf{ff}@\kappa &= \kappa\ \mathsf{FF} \\
\mathsf{and}\ \mathsf{ff}@\mathsf{tt}@\kappa &= \kappa\ \mathsf{FF} \\
\mathsf{and}\ \mathsf{ff}@\mathsf{ff}@\kappa &= \kappa\ \mathsf{FF}
\end{aligned}$$

Modulo the shift, these types look more like the conventional types of and. And modulo the continuation-passing, these definitions look just like its traditional definition by pattern-matching. There is a conceptual shift, though, in that we are defining the function by matching against its observations, rather than on its arguments. This does not make much difference syntactically, however, precisely because the boolean arguments are part of the observation, and the observation on the boolean result must be treated abstractly as a continuation variable. ■
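The two definitions of and can be mimicked by functions that take a whole observation as input. This Python sketch is an illustration under an assumed encoding, not the thesis's syntax: the uncurried observation $(b_1, b_2)@\kappa$ is the pair `((b1, b2), kappa)`, the curried observation $b_1@b_2@\kappa$ is `(b1, (b2, kappa))`, $\kappa$ is a Python callable, and `AND_TABLE` is a hypothetical lookup table for the four clauses.

```python
# One entry per clause of the definition.
AND_TABLE = {
    ("tt", "tt"): "TT",
    ("tt", "ff"): "FF",
    ("ff", "tt"): "FF",
    ("ff", "ff"): "FF",
}

def and_uncurried(observation):
    """Match an observation of shape ((b1, b2), kappa)."""
    (b1, b2), kappa = observation
    return kappa(AND_TABLE[(b1, b2)])

def and_curried(observation):
    """Match an observation of shape (b1, (b2, kappa))."""
    b1, (b2, kappa) = observation
    return kappa(AND_TABLE[(b1, b2)])
```

In both versions the boolean arguments are destructured as part of the observation, while the continuation is only ever bound to a variable and called, mirroring the remark that the observation on the result is treated abstractly.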

As we said before, we treat the collection of types in $\mathcal{L}$ as open-ended, and we could go on to consider many more interesting types by defining new patterns. Indeed, we should do this if we are using $\mathcal{L}$ as a real programming language, as opposed to just studying its high-level features. When we program in languages like ML and Haskell, we often define new datatypes to solve new problems, rather than trying to encode them with a fixed set of existing datatypes. Even if two types are structurally equivalent, it can be useful to maintain a conceptual distinction between them. For example, in a program manipulating red-black trees, we might introduce a type color containing the two patterns red and black. Although color is isomorphic to $\mathbb{B}$ and we can go back and forth between them, having these domain-specific tags is a helpful mnemonic device. If instead we always had to write tt and ff where we meant red and black, we would no doubt end up wasting too much time trying to remember which was which.

The virtue of our pattern-based formulation of $\mathcal{L}$ is that the precise collection of patterns doesn't matter for the high-level properties of the language. Thus there is no cost for defining new types when we program in $\mathcal{L}$. That said, in this thesis we are mainly interested in studying $\mathcal{L}$'s high-level features, and its relationship to traditional foundational languages like the λ-calculus. Indeed, it is worthwhile to take a brief look at the other end of the spectrum, where instead of building up a rich collection of types, we make do with just two.


4.3.4  Untyped,  or  "uni-typed"?  Better:  bi-typed! 

An age-old source of rancor in programming languages is the status of the untyped λ-calculus relative to its typed cousins. On the one hand, as Church showed, untyped λ-calculus is a computationally universal language capable of representing arbitrary partial recursive functions, whereas simply-typed (or polymorphic) λ-calculus is not (since all programs terminate). On the other hand, after the addition of recursive types it becomes a simple exercise to encode arbitrary untyped programs by means of a single "universal type" $U = \mu X.(X \to X)$. Thus it is sometimes said that the untyped λ-calculus is really uni-typed, i.e., it is a special case of typed programming with only one type.4 But on the third hand, untyped λ-calculus programs aren't literally the same thing as programs with the recursive type $U$. To translate an arbitrary untyped term into typed λ-calculus, at various places in the term we may have to insert coercions transforming subterms of type $U$ to type $U \to U$, and vice versa. For example, to encode the untyped term $(\lambda x.x\ x)\ (\lambda x.x\ x)$ we write

$$(\lambda x.(c\ x)\ x)\ d(\lambda x.(c\ x)\ x)$$

where $c$ is the coercion from $U$ to $U \to U$ and $d$ is the reverse coercion. It could be argued that these coercions don't amount to much, or alternatively that making them explicit clarifies the original program's sense... but in any case they're there in the typed term, where they aren't in the untyped term, and aren't needed.
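The coercions $c$ and $d$ can be made visible even in a dynamically typed setting by wrapping functions explicitly. The Python sketch below is a loose illustration with stated assumptions: the class `U` is a hypothetical wrapper playing the role of the recursive type, and the example term is the terminating $(\lambda x.x\ x)\ (\lambda y.y)$ rather than the diverging term from the text, so that it can actually be run.

```python
class U:
    """A value of the universal type, wrapping a function from U to U."""

    def __init__(self, fn):
        self.fn = fn

def c(u):
    """Coercion from U to (U -> U): unwrap."""
    return u.fn

def d(fn):
    """Coercion from (U -> U) to U: wrap."""
    return U(fn)

# The untyped term (\x. x x) (\y. y), with the coercions made explicit.
self_apply = lambda x: c(x)(x)
identity = d(lambda y: y)
result = self_apply(identity)
```

Every place where a value of type $U$ is used as a function requires an explicit `c`, and every place where a function is passed as a value of type $U$ requires an explicit `d`, which is exactly the bookkeeping that the untyped term leaves implicit.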

The point of these shallow remarks is just to motivate the observation that in $\mathcal{L}$, the status of untyped programs is significantly easier to understand, because the language definition is already generic with respect to types. Imagine, as a first step, building an untyped variant of $\mathcal{L}^+$. We begin by defining contexts that merely bind some continuation variables, without specifying their type:

$$\Delta ::= \kappa \mid \cdot \mid (\Delta_1, \Delta_2)$$

Then we define untyped value patterns as derivations of $\Delta \Vdash +$, which is defined just as we defined $\Delta \Vdash A^+$ but without any types, e.g., with rules like the following:

$$\dfrac{}{\kappa \Vdash +} \qquad \dfrac{}{\cdot \Vdash +}\ () \qquad \dfrac{\Delta_1 \Vdash + \qquad \Delta_2 \Vdash +}{\Delta_1, \Delta_2 \Vdash +}\ \mathsf{pair} \qquad \dfrac{\Delta \Vdash +}{\Delta \Vdash +}\ \mathsf{inl} \qquad \dfrac{\Delta \Vdash +}{\Delta \Vdash +}\ \mathsf{inr}$$

Finally, we use these untyped patterns to define the untyped terms of our language, basically as we did in the typed case. For example, if we forget types, the rule of refutation,

$$\dfrac{\overset{p}{\Delta \Vdash +} \longrightarrow \overset{E_p}{\Gamma, \Delta \vdash \#}}{\Gamma \vdash \bullet +}$$

builds a continuation $K = p \mapsto E_p$, given a map from untyped value patterns to expressions.5

4 See, e.g., [Harper, Chapter 21, "The Untyped λ-calculus"].

Now, what we have to notice here is that really we aren't defining a new language: all we are doing is instantiating $\mathcal{L}^+$ with a different set of patterns, which coincidentally all belong to the same type. We might even call this type "+". Concretely, we do not need to define a new operational semantics for the untyped language (it is simply an instance of the typed semantics), and all the results of §4.2.5 (such as the Separation Theorem) continue to hold without need for new proof. Thus, different from the situation we saw above for λ-calculus, untyped $\mathcal{L}^+$ programs are literally typed $\mathcal{L}^+$ programs with the single type +. If we continue to play this game for all of $\mathcal{L}$, we see that "untyped $\mathcal{L}$" is really just the special case where all value patterns introduce the positive type +, and all continuation patterns the negative type − (hence "bi-typed"), without any need for defining a new language.

What is going on here? Well in fairness to typed λ-calculus, we should point out that the reason additional coercions aren't necessary for type-checking untyped $\mathcal{L}^+$/$\mathcal{L}$ terms is because they are already there. For example, as noted in Propositions 4.2.7 and 4.2.8, in typed $\mathcal{L}^+$ there are explicit coercions transforming $A$-continuations into $\neg A$-values (replace $K$ by $\_[K]$), and $A$-values into $\neg A$-continuations (replace $V$ by $\kappa \mapsto \kappa\ V$). In untyped $\mathcal{L}^+$, these coercions don't have any action on types, but they must still be explicit, because values and continuations are different syntactic categories.

So have we really gained any footing in the debate over λ-calculus, typed vs. un-? In a sense, our pattern-based language definition argues for principles that should please both sides.6 On the one hand, we are reiterating the position that coercions really are necessary for understanding the computational meaning of untyped λ-terms, and force this into the syntax by maintaining separate syntactic categories of positive/negative values and continuations. But on the other hand, we are also supporting the position that computation can be understood before the level of types, in the sense that it can be defined generically for all types of a given polarity.

4.3.5 $\mathcal{L}$ equality, semantics and effects: overview

In the following sections we complete the definition of $\mathcal{L}$ by defining its semantics, both equationally (definitional equality, the identity and composition principles), and operationally (the environment semantics). Since we already explored these concepts at length for $\mathcal{L}^+$, here we proceed rapidly, only describing the additional (negative polarity) cases needed to extend the definitions to $\mathcal{L}$, and confirming that the main properties of the equational and operational theories still hold.

4.3.6 Definitional equality

For the positive fragment, definitional equality of $\mathcal{L}$-terms is defined just as for $\mathcal{L}^+$-terms. We include the following additional rules for dealing with the new terms in the language:

$$\dfrac{d :: (\Delta \Vdash \bullet A^-) \qquad \sigma_1 =_{\Gamma \vdash \Delta} \sigma_2}{d[\sigma_1] =_{\Gamma \vdash \bullet A^-} d[\sigma_2]}$$

$$\dfrac{d :: (\Delta \Vdash \bullet A^-) \longrightarrow V_1(d) =_{\Gamma, \Delta \vdash \#} V_2(d)}{V_1 =_{\Gamma \vdash A^-} V_2}$$

$$\dfrac{x : A^- \in \Gamma \qquad K_1 =_{\Gamma \vdash \bullet A^-} K_2}{x\ K_1 =_{\Gamma \vdash \#} x\ K_2}$$

We read these rules with conventions analogous to those for the positive fragment:

• Equality of continuation patterns is implicitly defined as equality of trees (ignoring names of variables), and equality of value variables is α-equivalence.

• The rule for equality of values implicitly requires that both value maps terminate on the same set of continuation patterns.

• We allow non-well-founded derivations of equality.

Proposition 4.3.3. $\mathcal{L}$ equality is reflexive, symmetric, and transitive.

5 Because there are infinitely many untyped patterns and we may only want to consider a subset when defining a continuation, it becomes even more important to have the expression $\mho$, which we can use as a default entry.

6 Or alternatively displease both, depending on the result of the glass half-full vs. half-empty debate.

4.3.7  Identity 

For any negative value variable $x : A^- \in \Gamma$, we define the identity $A$-value $\mathrm{Id}_x :: (\Gamma \vdash A^-)$ by the action $\mathrm{Id}_x\ d = x\ (d[\mathrm{Id}_\Delta])$.

4.3.8  Composition 

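For any negative value $V :: (\Gamma \vdash A^-)$ and negative continuation $K :: (\Gamma \vdash \bullet A^-)$, we define the composite expression $V \bullet K :: (\Gamma \vdash \#)$ by

$$V \bullet d[\sigma] = V(d)[\sigma]$$

For any $\mathcal{L}$ term $t :: (\Gamma(\Delta) \vdash J)$ and $\Delta$-substitution $\sigma$, we build the term $t[\sigma] :: (\Gamma \vdash J)$ by augmenting the definition from §4.2.8 with the following extra cases:

$$\begin{aligned}
d[\sigma_0][\sigma] &= d[\sigma_0[\sigma]] \\
V[\sigma] &= d \mapsto V(d)[\sigma] \\
(x\ K)[\sigma] &= \begin{cases} V \bullet K[\sigma] & \text{if } \sigma(x) = V \\ x\ (K[\sigma]) & \text{if } x \notin \mathrm{dom}(\sigma) \end{cases}
\end{aligned}$$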

4.3.9  Properties  of  composition 

Proposition 4.3.4 (Unit laws). The three unit laws of Lemma 4.2.15 continue to hold in $\mathcal{L}$, and moreover:

4. $\mathrm{Id}_x \bullet K = x\ K$

where $K :: (\Gamma \vdash \bullet A)$ and $x : A \in \Gamma$

Proposition  4.3.5  (Associativity).  The  associativity  equation  ('(^ [cri] ) [cr2]  =  f[(Id[A2],  vi)[p2}],  where 
g\  ::  (r  b  Ai)  and  a2  ::  (T,  Ai  b  A2)  and  t  ::  (T,  Ai,  A2  b  J))  continues  to  hold  in  £. 

Proof. These are both trivial generalizations of the arguments in §4.2.9. □

4.3.10 Type isomorphisms

We  can  extend  the  notion  of  type  isomorphism  to  negative  types: 

Definition  4.3.6.  We  say  that  two  negative  types  A~  and  B~  are  isomorphic  (A  «  B)  if  there  exist  a 
pair  of  value  transformers  f  ::  (x  :  B  \-  A)  and  ip  ::  (x  :  A  B)  which  are  inverses,  i.e.: 

1.  <j>[ljj/x\  =X:AhA  Idx 

2.  Ip[4>/x\  =x,b\-»b  Idx 

Proposition 4.3.7. The following isomorphisms of negative types hold:

A ⅋ (B ⅋ C) ≈ (A ⅋ B) ⅋ C      A ⅋ B ≈ B ⅋ A      A ≈ A ⅋ ⊥

A & (B & C) ≈ (A & B) & C      A & B ≈ B & A      A ≈ A & ⊤

A ⅋ (B & C) ≈ (A ⅋ B) & (A ⅋ C)      A ⅋ ⊤ ≈ ⊤

¬(A & B) ≈ ¬A ⅋ ¬B      ¬⊤ ≈ ⊥

Proposition  4.3.8.  77ze  following  isomorphisms  of  mixed  polarity  types  hold: 


A^B- 

->C^{A®B) 

-►  C 

c- «  1  - 

►  C 

C+ 

mC-± 

( B&C )  « 

{A  — >  B)&(A  - 

-C) 

(A  ©5)  - 

-*•  Cft 

*  (A  — 

>  C)&(B  ->  C) 

IA®IB 

~  i(A&B) 

A  Att 

l(A  -  -L) 

A- 

T-B  ~ 

a®ab 

]A^]B 

~  T(A  ©  B ) 

s  t(l  -  A) 

|A  - 

i  ©  A^B 

A+  «  1  -  (A 

--L) 

A-  ~  (1 

-A)- 

-*■  © 

4.3.11  Environment  semantics 

Environments for 𝓛 are defined just like 𝓛⁺-environments (§4.2.12), as lists of substitutions
γ = (σ1; …; σn), except that the substitutions σi now can contain mappings for value variables.
Generalizing Proposition 4.2.26, given a Γ-environment γ and a value variable x : A⁻ ∈ Γ, we
can look up the variable in the environment to get a negative value lookup(γ, x) :: (Γ ⊢ A⁻). We
then define the small-step environment semantics for arbitrary 𝓛 programs by including two
additional rules:

(γ | x K) ↦ (γ | lookup(γ, x) | K)    (lookup⁻)

(γ | V | d[σ]) ↦ (bind(γ, σ) | V(d))    (bind/call⁻)

which can also be refactored as a single rule:

(γ | x (d[σ])) ↦ (bind(γ, σ) | lookup(γ, x)(d))    (go⁻)

We adopt the same terminological conventions for 𝓛 as we did for 𝓛⁺: we identify the total
fragment by forbidding the use of Ω or ℧ and requiring the use of total recursive functions in
the definition of negative values/positive continuations, and we identify the pure fragment by
allowing Ω and partial recursive functions but forbidding effects like ℧.

Again,  we  can  state  strong  versions  of  the  usual  progress  and  preservation  lemmas. 
Lemma 4.3.9 (Intrinsic progress). For all programs P in pure 𝓛, there exists a P′ such that P ↦ P′.

Lemma 4.3.10 (Intrinsic preservation). For pure 𝓛, if (γ | E) ↦ (γ′ | E′) then E[γ] = E′[γ′].

Corollary 4.3.11 (Functionality). For pure 𝓛, if (γ | E) ⇓ R then E[γ] = R.

The proofs of these properties are trivial generalizations of the proofs in §4.2.12.

4.3.12  Observational  equivalence  and  the  separation  theorems 

Observational equivalence is defined just as before: two 𝓛 expressions E1, E2 :: (Γ ⊢ #) are
observationally equivalent (E1 ≅ E2) if they yield the same result in all Γ-environments. The
relationship between observational equivalence and definitional equality in 𝓛 is essentially
unchanged from the situation in 𝓛⁺. Again, it is easy to see that = is a congruence, and somewhat
less direct but still not too difficult to establish that given enough effects, any two syntactically
distinct expressions may be observationally distinguished.

Theorem 4.3.12. If E1 = E2 then E1 ≅ E2.

Theorem 4.3.13 (Affine separation). For pure affine expressions of 𝓛, E1 = E2 iff E1 ≅_℧ E2.

Theorem 4.3.14 (Separation). For arbitrary pure expressions of 𝓛, E1 = E2 iff E1 ≅_{℧,r/w} E2.

By ≅_℧ and ≅_{℧,r/w} we of course mean observational equivalence in the extensions of 𝓛 with
immediate failure/ground state. The proofs of these theorems are trivial generalizations of the
analogous results for 𝓛⁺, though we must first extend the chronicle representation to arbitrary 𝓛
expressions and substitutions by including the following additional (straightforward) definitions:

c  £  |cr|  c  €  \a  x'  d'\ 

(x,  d)  ■  c  G  |  (m,  d,  c)|  (xr,  d')  ■  c  e  |<r| 


4.4  Polarization  and  CPS  translations 

Near the end of Chapter 3, we drew a diagram explaining how to derive different double-negation
translations of classical logic via polarization and focusing:

polarized  logic 

classical  logic  _ _  minimal  logic 

Given  a  proof  of  a  classical  proposition  b,  we  can  polarize  b  any  way  we  like  as  a  PPL  proposition 
A  =  b*  (with  |  A  |  =  b),  and  via  the  completeness  of  focusing  obtain  a  focusing  proof  of  A.  This 
focusing  proof  can  also  be  interpreted  as  a  canonical  derivation  (depending  on  the  polarity  of 
A, either asserting · ⊢ A⁻, or κ : •A⁺ ⊢ #), which can then be translated directly into a proof of
an  unpolarized  proposition  in  (the  conjunction-disjunction-negation  fragment  of)  minimal  logic. 
The  composition  of  these  two  steps  yields  different  double-negation  translations  of  classical  into 
minimal  logic,  depending  on  how  we  choose  to  polarize  the  original  classical  theorem. 

In  this  section,  we  reconsider  the  same  picture  from  a  computational  standpoint: 
𝓛

λ-calculus _ _ CPS λ-calculus

How to read this diagram? Suppose we have a simply-typed term M : τ of λ-calculus with sums
and products. We can polarize τ any way we like, e.g., interpreting products as ⊗ or &, inserting
shifts ↓ and ↑ in random places, etc. The focusing completeness theorem says that we can obtain
a focusing proof of the polarized type A, which we can then interpret as a term of 𝓛 (depending
on the polarity of A, either a closed negative value or an expression with one free positive
continuation variable). Finally, this 𝓛-term can be reinterpreted as a simply-typed λ-calculus
term in continuation-passing style, resulting in a CPS translation of the original λ-term. Again,
we can then see different CPS translations as arising from different polarizations, although the
translation isn't entirely determined by polarization, because (as already noted in Chapter 3)
there is a bit of ambiguity in the computational content of focalization, particularly whether to
use left-to-right or right-to-left evaluation of products.⁷

4.4.1  From λ-calculus to 𝓛⁺ and back

Rather than jumping in with full generality, we will begin with a simple example where the
polarization is purely positive, and where the result of chasing the diagram is a well-known
call-by-value CPS transformation. A similar result was presented by Fuhrmann and Thielecke
[2004], who showed how to derive this particular CPS transform as a composition of two more
primitive translations (motivated by the categorical semantics of call-by-value, though, rather
than by proof theory). The reader might revisit Theorem 3.5.5 to see the proof-theoretic statement
closely corresponding to this purely positive polarization.

We take as our starting point the λ-calculus with sums and products (without atoms, but
those could be easily added if we included them in 𝓛⁺). Types are given by the following
grammar:

τ ::= τ1 ⊃ τ2 | τ1 ∧ τ2 | τ1 ∨ τ2 | T | F

and terms by the following, with the standard typing rules:

M ::= λx.M | M1 M2
    | (M1, M2) | π1 M | π2 M
    | ι1 M | ι2 M | case(M, y.M1, z.M2)
    | () | abort(M)

7 One aspect of this analogy that may need a bit of clarification is why the left corner of the logical diagram
contains classical logic (and in particular the focusing completeness theorem applied to classical sequent calculus)
while the left corner of the computational diagram contains λ-calculus (which is supposed to correspond to natural
deduction for intuitionistic logic). To be clear, we can apply the translation λ → 𝓛 generally for λ-calculus with
effects. The historical genesis of the CPS translations, after all, was to account for the behavior of different λ-calculus
evaluation strategies in the presence of non-termination and side-effects. In particular, we could extend the translation
λ → 𝓛 to λ-calculus with the control operator callcc, corresponding to Peirce's Law and classical natural deduction.
As for our choice to prove the completeness theorem for sequent calculus rather than natural deduction: that was
only to relate to the historical view of focusing as a sequent calculus search procedure. The computational content
of the two versions of the theorem (SC vs. ND) is not identical, but very similar.

As our polarization (−)*, we interpret all the connectives positively:

(τ1 ⊃ τ2)* = ¬(τ1* ⊗ ¬τ2*)    (τ1 ∧ τ2)* = τ1* ⊗ τ2*    (τ1 ∨ τ2)* = τ1* ⊕ τ2*    T* = 1    F* = 0

It is trivial to verify that this is indeed a polarization (recalling Definition 3.4.1), in particular
because τ1 ⊃ τ2 and ¬(τ1 ∧ ¬τ2) are classically equivalent. Now, the first result we establish
is really an instance of the focusing completeness theorem, and describes how to translate any
λ-calculus term into a corresponding term of 𝓛⁺:

Theorem 4.4.1. For any λ-term Γ ⊢ M : τ, there is a corresponding total 𝓛⁺-expression E = M* κ,
where E :: (Γ*, κ : •τ* ⊢ #).

Proof. To make the translation more readable, we use the notation "E where κ = K" as sugar for
the composition E[(K/κ)]. On variables and function abstraction/application, the translation is
defined as follows:

x* κ = κ Id_x

(λx.M)* κ = κ ((x, κ′) ↦ M* κ′)

(M1 M2)* κ = M1* κ1 where κ1 = (κ′ ↦
                 M2* κ2 where κ2 = (x ↦
                     κ′ PAIR(Id_x, Id_κ)))

(Note that in the λx.M case, we are leaving implicit the coercion treating the τ1* ⊗ ¬τ2*-continuation
as a ¬(τ1* ⊗ ¬τ2*)-value, and in the M1 M2 case, we are making use of the combinator PAIR defined
in Proposition 4.2.5.) On pairing and projection the translation is as follows:

(M1, M2)* κ = M1* κ1 where κ1 = (x ↦
                  M2* κ2 where κ2 = (y ↦
                      κ PAIR(Id_x, Id_y)))

(π1 M)* κ = M* κ′ where κ′ = ((x, y) ↦ κ Id_x)

(π2 M)* κ = M* κ′ where κ′ = ((x, y) ↦ κ Id_y)

On injection/case-analysis as follows:

(ι1 M)* κ = M* κ′ where κ′ = (y ↦ κ (INL Id_y))

(ι2 M)* κ = M* κ′ where κ′ = (z ↦ κ (INR Id_z))

(case(M, y.M1, z.M2))* κ = M* κ′ where κ′ = (inl y ↦ M1* κ
                                             | inr z ↦ M2* κ)

And finally on unit/abort as follows:

()* κ = κ ()

(abort(M))* κ = M* κ′ where κ′ = abort

where abort is the vacuous 0-continuation, with no pattern branches.

A couple of observations about this translation:

•  We have to verify that the translation does what it says it does, i.e., that the 𝓛⁺ expressions
have the appropriate type. But this is an easy mechanical exercise, and really the
translation was almost entirely determined by types in the first place.
Types  τ ::= ~τ | τ1 ∧ τ2 | τ1 ∨ τ2 | T | F
Terms:

x : τ ∈ Γ       Γ, x : τ ⊢ M : #       Γ ⊢ M1 : ~τ    Γ ⊢ M2 : τ
---------       ----------------       --------------------------
Γ ⊢ x : τ       Γ ⊢ λx.M : ~τ          Γ ⊢ M1 M2 : #

Γ ⊢ M1 : τ1    Γ ⊢ M2 : τ2       Γ ⊢ M : τ1 ∧ τ2       Γ ⊢ M : τ1 ∧ τ2
--------------------------       ---------------       ---------------
Γ ⊢ (M1, M2) : τ1 ∧ τ2           Γ ⊢ π1 M : τ1         Γ ⊢ π2 M : τ2

Γ ⊢ M : τ1               Γ ⊢ M : τ2
------------------       ------------------
Γ ⊢ ι1 M : τ1 ∨ τ2       Γ ⊢ ι2 M : τ1 ∨ τ2

Γ ⊢ M : τ1 ∨ τ2    Γ, y : τ1 ⊢ M1 : τ′    Γ, z : τ2 ⊢ M2 : τ′
-------------------------------------------------------------
Γ ⊢ case(M, y.M1, z.M2) : τ′

                 Γ ⊢ M : F
----------       -----------------
Γ ⊢ () : T       Γ ⊢ abort(M) : τ′


Figure 4.1: CPS-restricted λ-calculus


•  The  choice  of  left-to-right  evaluation  order  in  the  translations  of  application  and  pairing  is 
not  determined  by  types,  only  by  convention. 


□ 

Our next step will be to give a general translation from 𝓛⁺ terms to the fragment of λ-calculus in
continuation-passing style. The λ-calculus typing rules specialized to this fragment are displayed
in Figure 4.1, and are a direct annotation of natural deduction with conjunction, disjunction, and
minimal negation (Figure 3.7).

Let (−)^m be the translation from polarized logic into minimal logic defined in §3.5. We recall
its action on positive propositions, and on assertions or refutations:

(¬A)^m = ~A^m    (A ⊕ B)^m = A^m ∨ B^m    (A ⊗ B)^m = A^m ∧ B^m    1^m = T    0^m = F

(A⁺)^m = A^m    (•A⁺)^m = ~A^m

The following statement simply gives the computational content of Theorem 3.5.1, specialized to
the positive fragment (generalizing it back to cover all of total 𝓛 is a straightforward exercise).

Theorem 4.4.2. For any total 𝓛⁺ term t :: (Γ ⊢ J) there is a corresponding CPS term M = t^m, where
Γ^m ⊢ M : J^m.

Proof.  To  define  the  translation,  we  need  two  auxiliary  translations: 

1.  For any pattern p :: (Δ ⊩ A), there exists a CPS term M = p^m, where Δ^m ⊢ M : A^m.

2.  Suppose that for every pattern p :: (Δ ⊩ A), there is a CPS term M_p, where Γ^m, Δ^m ⊢ M_p :
J^m. Then there is a CPS term λx.M = λ(p ↦ M_p), where Γ^m, x : A^m ⊢ M : J^m.

The  translations  (1)  and  (2)  are  defined  by  a  straightforward  induction  on  types  as  follows: 
Type     Translation of patterns (p^m)
------   -----------------------------------------------------
¬A       κ^m = κ
A ⊗ B    (p1, p2)^m = (p1^m, p2^m)
A ⊕ B    (inl p)^m = ι1 (p^m) and (inr p)^m = ι2 (p^m)
1        ()^m = ()
0        no patterns

Type     Translation of pattern-indexed terms (λ(p ↦ M_p))
------   -----------------------------------------------------
¬A       λ(κ ↦ M_κ) = λκ.M_κ
A ⊗ B    λ((p1, p2) ↦ M_(p1,p2)) = λ(y, z).M′
           where λy.λz.M′ = λ(p1 ↦ λ(p2 ↦ M_(p1,p2)))
A ⊕ B    λ(inl p1 ↦ M_(inl p1) | inr p2 ↦ M_(inr p2)) = λx.case(x, y.M1, z.M2)
           where λy.M1 = λ(p1 ↦ M_(inl p1)), λz.M2 = λ(p2 ↦ M_(inr p2))
1        λ(() ↦ M_()) = λx.M_()
0        λ∅ = λx.abort(x)

Note that in the ⊗-clause of translation (2), λ(y, z).M′ is syntactic sugar for λx.M′[π1 x/y][π2 x/z].
The translation of arbitrary 𝓛⁺ terms then proceeds as follows:

(p[σ])^m = p^m[σ^m]    K^m = λ(p ↦ K(p)^m)

(·)^m = ()    (K/κ)^m = K^m    (σ1, σ2)^m = (σ1^m, σ2^m)    (κ V)^m = κ (V^m)

where p^m[σ^m] is a bit of syntactic high fructose corn syrup, standing for the result of substituting
the components of the tuple σ^m for the appropriate variables in p^m. (We could avoid this
notational malnutrition if natural deduction were better fortified with first-class substitutions.)

□ 

We have a translation from λ-calculus into 𝓛⁺, and we have a translation from 𝓛⁺ back into
CPS-restricted λ-calculus (or "CPS calculus" for short). So what happens when we compose
them? In Figure 4.2, we show a well-known CPS translation, sending a λ-calculus term M : τ
to another term M^v : ~~τ^v, where the translation on types is:

(A → B)^v = ~(A^v ∧ ~B^v)    (A ∧ B)^v = A^v ∧ B^v    (A ∨ B)^v = A^v ∨ B^v    T^v = T    F^v = F

M^v is essentially the CPS transformation described by Reynolds [1972], Fischer [1972], and
Plotkin [1975], generalized to arbitrary terms of λ-calculus with sums and products. And indeed,
we can derive it as the composition (−)^m ∘ (−)*.

Theorem 4.4.3. If Γ ⊢ M : τ then Γ^v ⊢ M^v : ~~τ^v.

Proof. Immediate consequence of Theorems 4.4.1 and 4.4.2, after mechanical verification that
τ^v = (τ*)^m and M^v = λκ.(M* κ)^m.  □
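The type-level half of this verification is easy to mechanize. The following Python sketch encodes the three type translations as functions on nested tuples and compares the direct translation with the composite on a sample type. The tuple encoding and the names `star`, `minimal`, and `cbv_type` are assumptions made for illustration only; they do not appear in the text.

```python
# Types are nested tuples (an illustrative encoding, not the thesis's syntax):
#   simple types:   ("imp", a, b) | ("and", a, b) | ("or", a, b) | ("T",) | ("F",)
#   positive types: ("neg", A) | ("tensor", A, B) | ("plus", A, B) | ("one",) | ("zero",)
#   minimal types:  ("not", a) | ("and", a, b) | ("or", a, b) | ("T",) | ("F",)

def star(t):
    """Purely positive polarization (-)* of a simple type."""
    tag = t[0]
    if tag == "imp":
        return ("neg", ("tensor", star(t[1]), ("neg", star(t[2]))))
    if tag == "and":
        return ("tensor", star(t[1]), star(t[2]))
    if tag == "or":
        return ("plus", star(t[1]), star(t[2]))
    return {"T": ("one",), "F": ("zero",)}[tag]

def minimal(p):
    """Translation (-)^m from positive types into minimal logic."""
    tag = p[0]
    if tag == "neg":
        return ("not", minimal(p[1]))
    if tag == "tensor":
        return ("and", minimal(p[1]), minimal(p[2]))
    if tag == "plus":
        return ("or", minimal(p[1]), minimal(p[2]))
    return {"one": ("T",), "zero": ("F",)}[tag]

def cbv_type(t):
    """Direct call-by-value type translation (-)^v."""
    tag = t[0]
    if tag == "imp":
        return ("not", ("and", cbv_type(t[1]), ("not", cbv_type(t[2]))))
    if tag in ("and", "or"):
        return (tag, cbv_type(t[1]), cbv_type(t[2]))
    return t  # T and F are left unchanged

# (T ∧ F) ⊃ ((T ∨ T) ⊃ F), an arbitrary sample type
sample = ("imp", ("and", ("T",), ("F",)), ("imp", ("or", ("T",), ("T",)), ("F",)))
agrees = cbv_type(sample) == minimal(star(sample))
```

Since all three functions are defined by structural recursion with matching clauses, agreement on every simple type follows by a one-line induction; the check above only exercises it on one example.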

4.4.2  Reconstructing  call-by-value  and  call-by-name 

It  is  natural  to  consider  the  purely  negative  polarization  dual  to  the  one  in  §4.4.1  (and  close  to 
the  computational  content  of  Theorem  3.5.6): 

(τ1 ⊃ τ2)* = ¬τ1* ⅋ τ2*    (τ1 ∧ τ2)* = τ1* & τ2*    (τ1 ∨ τ2)* = τ1* ⅋ τ2*    T* = ⊤    F* = ⊥
Action on types:

(A → B)^v = ~(A^v ∧ ~B^v)    (A ∧ B)^v = A^v ∧ B^v    (A ∨ B)^v = A^v ∨ B^v    T^v = T    F^v = F

Action on terms (M : τ ⟹ M^v : ~~τ^v):

x^v                       = λκ.κ x
(λx.M)^v                  = λκ.κ (λ(x, κ′).M^v κ′)
(M1 M2)^v                 = λκ.M1^v (λf.M2^v (λx.f (x, κ)))
(M1, M2)^v                = λκ.M1^v (λy.M2^v (λz.κ (y, z)))
(π1 M)^v                  = λκ.M^v (λx.κ (π1 x))
(π2 M)^v                  = λκ.M^v (λx.κ (π2 x))
(ι1 M)^v                  = λκ.M^v (λx.κ (ι1 x))
(ι2 M)^v                  = λκ.M^v (λx.κ (ι2 x))
(case(M, y.M1, z.M2))^v   = λκ.M^v (λx.case(x, y.M1^v κ, z.M2^v κ))
()^v                      = λκ.κ ()
(abort(M))^v              = λκ.M^v (λx.abort(x))

Figure 4.2: A CPS translation of call-by-value functions, strict products and sums
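The clauses of Figure 4.2 can be tried out directly. The Python sketch below is a shallow embedding: instead of producing CPS syntax, each clause builds the Python closure that the corresponding CPS term would denote. The tuple encoding of source terms and the names `cps_v` and `run` are assumptions for illustration, and the abort clause is omitted because no closed value of type F exists to trigger it.

```python
# Source terms as tuples (illustrative encoding):
#   ("var", x) | ("lam", x, M) | ("app", M1, M2) | ("pair", M1, M2)
#   ("fst", M) | ("snd", M) | ("inl", M) | ("inr", M)
#   ("case", M, y, M1, z, M2) | ("unit",)

def cps_v(term, env):
    """Denotation of term^v: a function that expects a continuation k."""
    tag = term[0]
    if tag == "var":      # x^v = \k. k x
        return lambda k: k(env[term[1]])
    if tag == "lam":      # (\x.M)^v = \k. k (\(x,k'). M^v k')
        _, x, body = term
        return lambda k: k(lambda xk: cps_v(body, {**env, x: xk[0]})(xk[1]))
    if tag == "app":      # (M1 M2)^v = \k. M1^v (\f. M2^v (\x. f (x,k)))
        _, m1, m2 = term
        return lambda k: cps_v(m1, env)(
            lambda f: cps_v(m2, env)(lambda x: f((x, k))))
    if tag == "pair":     # left component is evaluated first
        _, m1, m2 = term
        return lambda k: cps_v(m1, env)(
            lambda y: cps_v(m2, env)(lambda z: k((y, z))))
    if tag in ("fst", "snd"):
        index = 0 if tag == "fst" else 1
        return lambda k: cps_v(term[1], env)(lambda p: k(p[index]))
    if tag in ("inl", "inr"):
        return lambda k: cps_v(term[1], env)(lambda x: k((tag, x)))
    if tag == "case":
        _, m, y, m1, z, m2 = term
        def on_sum(k):
            def branch(s):
                side, v = s
                if side == "inl":
                    return cps_v(m1, {**env, y: v})(k)
                return cps_v(m2, {**env, z: v})(k)
            return cps_v(m, env)(branch)
        return on_sum
    if tag == "unit":     # ()^v = \k. k ()
        return lambda k: k(())
    raise ValueError(f"unknown term: {term!r}")

def run(term):
    """Run a closed term with the identity as top-level continuation."""
    return cps_v(term, {})(lambda v: v)
```

Note that a function value receives its argument and its return continuation together as one pair, matching the type ~(A^v ∧ ~B^v).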


This is natural, albeit somewhat artificial. Such a global dualization dramatically changes the
computational interpretations of all of the λ-calculus type constructors, and not all of these
interpretations occur in practice. This interpretation of λ-calculus sums in particular is very
strange (e.g., because of type isomorphisms like A ⅋ ⊤ ≈ ⊤), and although such interpretations
do appear sometimes in the literature on call-by-value/call-by-name duality, I would claim only
as artifacts of misdirected attempts to treat evaluation order as a global, uniform policy decision
independent of the type structure of a language, rather than as a local policy determined by
types.⁸ So instead of reconsidering the entire type structure of the λ-calculus, let us narrow our
attention to the call in call-by-value/call-by-name, and consider specifically the different possible
polarizations of the function space, and how they come about.

As a logical connective in §2.3.1, and as a type constructor in §4.3.3, we introduced a very
natural negative polarity implication/function space. The type A → B, with A positive and B
negative, is defined by its continuation patterns p@d, where p is an A-value pattern, and d a
B-continuation pattern. Let us call this negative connective − → − the primordial function space.

Given the properties of the primordial function space, we can understand the choice to use
call-by-value or call-by-name semantics uniformly for all functions in a program as corresponding to
8In  one  of  the  seminal  works  on  computational  duality,  for  example,  Filinski  [1989]  introduced  the  symmetric 
lambda  calculus,  and  defined  what  he  called  its  "pure  call-by-name  semantics",  corresponding  very  closely  to  this 
negative  polarization.  He  made  the  observation  that  "the  pure  CBN  coproduct  is  quite  different  from  the  one  found 
in  typical  lazy  programming  languages"  [ibid.,  §2.5.1],  but  argued  that  this  was  not  a  defect,  because  the  typical 
notion  could  be  encoded  by  delaying  the  two  branches  of  the  sum.  Essentially,  in  our  terminology,  Filinski  was 
arguing  that  lazy  sums  could  be  encoded  by  the  type  -<^A^^^B,  which  is  indeed  isomorphic  (by  Prop.  4.3.8)  to  the 
encoding  T(JA  ©  \ .B)  mentioned  in  §4.3.3.  As  I  argued  above,  though,  the  fact  that  two  types  are  isomorphic  does 
not  mean  it  is  particularly  natural  to  replace  one  by  the  other. 
a choice to give all types the same polarity, positive or negative. Because the primordial function
space is fundamentally mixed polarity, to get the polarities to work out for call-by-value/call-by-name
function spaces we have to insert shifts in different places. Suppose, for instance, that
we decide to interpret all types positively, as we did in §4.4.1. How do we make use of the
primordial function space? The primordial function space will gladly accept positive types, but
it returns negative types. So if we want to return a positive type B⁺, it must be shifted, i.e.,
we must use A → ↑B. But A → ↑B is itself negative, so we must also shift the entire formula
(or else we could not have higher-order functions). In this way, we derive the decomposition
of the call-by-value function space A →ᵛ B = ↓(A → ↑B), which is indeed isomorphic to the
polarization ¬(A ⊗ ¬B) we discussed in §4.4.1. Conversely, if we choose to interpret all types
negatively, then the return type of the primordial function space has the right polarity, as well
as the function space itself, but we are forced to shift the argument type. In this way, we derive
the decomposition of the call-by-name function space A →ⁿ B = ↓A → B, which is isomorphic to
the above polarization ¬A ⅋ B.

Thus, the choice to use uniformly positive or negative polarity results in two different minimal
annotations of the primordial function space:

A →ᵛ B = ↓(A → ↑B)        A →ⁿ B = ↓A → B

In both cases, we could of course insert extra shifts without breaking the polarity invariants
(e.g., we could replace ↓A → B by ↑↓(↓A → B)), but these are the fewest we can get away
with.
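The polarity bookkeeping in this argument can be checked mechanically. The Python sketch below represents polarized types as nested tuples, computes polarities while rejecting ill-polarized formulas, and builds the two minimal annotations for simple function types. The encoding, the use of polarized atoms as base types, and the names `polarity`, `cbv`, `cbn`, and `shifts` are all assumptions made for illustration.

```python
# Polarized types (illustrative encoding):
#   ("pos", name) | ("neg", name)   atoms of fixed polarity
#   ("down", A)                     shift from negative to positive
#   ("up", A)                       shift from positive to negative
#   ("arrow", A, B)                 primordial function space, A positive, B negative
# Simple types: ("base", name) | ("fun", a, b)

def polarity(t):
    """Return "+" or "-", raising TypeError if a polarity invariant is broken."""
    tag = t[0]
    if tag == "pos":
        return "+"
    if tag == "neg":
        return "-"
    if tag == "down":
        if polarity(t[1]) != "-":
            raise TypeError("down-shift expects a negative type")
        return "+"
    if tag == "up":
        if polarity(t[1]) != "+":
            raise TypeError("up-shift expects a positive type")
        return "-"
    if tag == "arrow":
        if polarity(t[1]) != "+" or polarity(t[2]) != "-":
            raise TypeError("arrow expects positive argument and negative result")
        return "-"
    raise ValueError(f"unknown type: {t!r}")

def cbv(t):
    """Uniformly positive reading: a -> b becomes down(a -> up b)."""
    if t[0] == "base":
        return ("pos", t[1])
    return ("down", ("arrow", cbv(t[1]), ("up", cbv(t[2]))))

def cbn(t):
    """Uniformly negative reading: a -> b becomes (down a) -> b."""
    if t[0] == "base":
        return ("neg", t[1])
    return ("arrow", ("down", cbn(t[1])), cbn(t[2]))

def shifts(t):
    """Count the shifts occurring in a polarized type."""
    here = 1 if t[0] in ("up", "down") else 0
    return here + sum(shifts(s) for s in t[1:] if isinstance(s, tuple))

# (a -> b) -> c, a higher-order sample type
higher_order = ("fun", ("fun", ("base", "a"), ("base", "b")), ("base", "c"))
```

On this sample, the call-by-value reading uses two shifts per arrow and the call-by-name reading one, and removing any of them makes `polarity` fail.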

For both of these polarizations, the embedding (−)* of λ-calculus into 𝓛 is almost entirely
determined by types, which then determines a unique CPS translation by the embedding (−)^m
from 𝓛 into CPS calculus. For the call-by-value polarization, we define (−)* as follows on
variables, function abstraction, and application:⁹

(τ1 ⊃ τ2)* = ↓(τ1* → ↑τ2*)

x* κ = κ Id_x

(λx.M)* κ = κ (x@κ′ ↦ M* κ′)

(M1 M2)* κ = M1* κ1 where κ1 = (y ↦
                 M2* κ2 where κ2 = (x ↦
                     y (Id_x@Id_κ)))

Note this is almost identical to the definition of (−)* we gave in the previous section for the
isomorphic polarization of the function space ¬(A ⊗ ¬B), with only a few trivial syntactic
differences. Indeed, after the embedding (−)^m, it yields the same CPS translation, Reynolds'
original call-by-value one:

x^v = λκ.κ x

(λx.M)^v = λκ.κ (λ(x, κ′).M^v κ′)

(M1 M2)^v = λκ.M1^v (λf.M2^v (λx.f (x, κ)))

9 Again, here we cannot yet define it for products and sums, because we have not fixed their polarization, only
their polarity.
Plotkin [1975]: polarization (τ1 ⊃ τ2)* = ↑↓(↓τ1* → τ2*)

x^p = x
(λx.M)^p = λκ.κ (λx.M^p)
(M1 M2)^p = λκ.M1^p (λf.f M2^p κ)

Streicher and Reus [1998]: polarization (τ1 ⊃ τ2)* = ↓τ1* → τ2*

x^s = x
(λx.M)^s = λ(x, κ).M^s κ
(M1 M2)^s = λκ.M1^s (M2^s, κ)


Figure 4.3: The Plotkin and Streicher call-by-name CPS transformations

Now consider the call-by-name polarization. Because τ* is negative, we can translate M : τ
directly to a negative τ*-value M*, without having to abstract in a continuation variable (cf.
Corollary 3.4.11 of the focusing completeness theorem):¹⁰

(τ1 ⊃ τ2)* = ↓τ1* → τ2*

x* = Id_x

(λx.M)* = x@d ↦ M*(d)

(M1 M2)* = κ ↦ M1* • (M2*@Id_κ)

Under the embedding (−)^m, this directly yields the following CPS translation:

x^s = x

(λx.M)^s = λ(x, κ).M^s κ

(M1 M2)^s = λκ.M1^s (M2^s, κ)

Note, this is not Plotkin's original call-by-name translation: this is what is sometimes called the
Streicher translation (introduced by Lafont, Reus, and Streicher [1993], and later studied in depth
by Streicher and Reus [1998] and Hofmann and Streicher [1997]). Recall the Plotkin translation:

x^p = x

(λx.M)^p = λκ.κ (λx.M^p)

(M1 M2)^p = λκ.M1^p (λf.f M2^p κ)

10 And here the definition of (−)* is entirely forced by types, so the reader can try reconstructing it before reading
on.

As it happens, Plotkin's translation really does correspond to the "inefficient" negative
polarization ↑↓(↓A → B). The reader may try reconstructing this, before looking at the next three
lines:

(τ1 ⊃ τ2)* = ↑↓(↓τ1* → τ2*)

x* = Id_x

(λx.M)* = κ ↦ κ (x@d ↦ M*(d))

(M1 M2)* = κ ↦ M1* • (y ↦ y (M2*@Id_κ))

The syntactic correspondence between this polarization and the Plotkin translation is almost
completely apparent; it becomes completely transparent if we fill in the missing η-expansion in
the abstraction clause of the Plotkin translation:

(λx.M)^p = λκ.κ (λx.λκ′.M^p κ′)
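To compare the two call-by-name translations concretely, the following Python sketch gives a shallow embedding of each: every clause builds the closure that the corresponding CPS term would denote, with variables bound to unevaluated computations. The term encoding, the function names `cps_s` and `cps_p`, and the `("const", c)` form (an observable base constant, which is not part of the function fragment treated in the text) are assumptions made for illustration.

```python
# Source terms (illustrative encoding):
#   ("var", x) | ("lam", x, M) | ("app", M1, M2) | ("const", c)

def cps_s(term, env):
    """Streicher-style translation: a function expects the pair (argument, k)."""
    tag = term[0]
    if tag == "var":      # x^s = x
        return env[term[1]]
    if tag == "lam":      # (\x.M)^s = \(x,k). M^s k
        _, x, body = term
        return lambda xk: cps_s(body, {**env, x: xk[0]})(xk[1])
    if tag == "app":      # (M1 M2)^s = \k. M1^s (M2^s, k)
        _, m1, m2 = term
        return lambda k: cps_s(m1, env)((cps_s(m2, env), k))
    if tag == "const":    # assumed base case: return the constant to k
        return lambda k: k(term[1])
    raise ValueError(f"unknown term: {term!r}")

def cps_p(term, env):
    """Plotkin-style translation, with the eta-expanded abstraction clause."""
    tag = term[0]
    if tag == "var":      # x^p = x
        return env[term[1]]
    if tag == "lam":      # (\x.M)^p = \k. k (\x. \k'. M^p k')
        _, x, body = term
        return lambda k: k(
            lambda arg: lambda k2: cps_p(body, {**env, x: arg})(k2))
    if tag == "app":      # (M1 M2)^p = \k. M1^p (\f. f M2^p k)
        _, m1, m2 = term
        return lambda k: cps_p(m1, env)(lambda f: f(cps_p(m2, env))(k))
    if tag == "const":
        return lambda k: k(term[1])
    raise ValueError(f"unknown term: {term!r}")

def run_s(term):
    return cps_s(term, {})(lambda v: v)

def run_p(term):
    return cps_p(term, {})(lambda v: v)

# Omega = (\x. x x) (\x. x x) diverges if it is ever run.
delta = ("lam", "x", ("app", ("var", "x"), ("var", "x")))
omega = ("app", delta, delta)
ignores_argument = ("app", ("lam", "y", ("const", 1)), omega)
```

Both embeddings return a result on `ignores_argument`, since the diverging argument is passed along as a suspended computation and never invoked. The extra continuation call in the Plotkin clauses is the visible trace of the two additional shifts in its polarization.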


4.4.3  Polarization  for  fun  and  profit 

We have not exhausted the possibilities, of course. There are infinitely many polarizations of
the simple types, and the focusing completeness theorem tells us that each of these results in
a CPS translation. Whether or not these different polarizations are interesting is another story,
but it's hard to rule them out. Before ending the chapter, let's consider one more polarization
of the λ-calculus function space, very simple but unconventional:

(τ1 ⊃ τ2)* = ¬τ1* ⊕ τ2*

x* κ = κ Id_x

(λx.M)* κ = κ INL(x ↦ M* κ′)
    where κ′ = (y ↦ κ INR(Id_y))

(M1 M2)* κ = M1* κ′
    where κ′ = (inl κ″ ↦ M2* κ″ | inr x ↦ κ Id_x)

Here we have chosen to polarize the λ-calculus function space not using the negative function
space A⁺ → B⁻ of 𝓛, but rather with the positive ¬A ⊕ B. It may be surprising that this works:
a functional value is either a continuation for its argument, or the value of its result, with no
dependence between them. However, because the polarization is positive, λ-calculus terms are
not translated directly into values, but rather to expressions with a free continuation variable.
And we see in the translation (λx.M)* above that the expression invokes the continuation variable
κ twice. The first time, it calls κ with a τ1* continuation. Within that continuation, it has a value
x of type τ1*, so now it evaluates M*, which can then call κ with a value of type τ2* depending
on x.
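The double use of the continuation is easier to see when run. The Python sketch below is a shallow embedding of this translation, in which a "function value" is a tagged pair: either `("inl", c)` with `c` a continuation for the argument, or `("inr", v)` with `v` a result. The term encoding, the name `cps_sum`, and the `("const", c)` base case are assumptions made for illustration, and the sketch relies on Python closures being re-invocable to play the role of the continuation that is called twice.

```python
# Source terms (illustrative encoding):
#   ("var", x) | ("lam", x, M) | ("app", M1, M2) | ("const", c)

def cps_sum(term, env):
    """Translation under the polarization (a -> b)* = not(a*) + b*."""
    tag = term[0]
    if tag == "var":
        return lambda k: k(env[term[1]])
    if tag == "lam":
        _, x, body = term
        def abstraction(k):
            def on_argument(v):
                # Later call to k: deliver the result, tagged inr.
                return cps_sum(body, {**env, x: v})(lambda y: k(("inr", y)))
            # First call to k: offer a continuation for the argument.
            return k(("inl", on_argument))
        return abstraction
    if tag == "app":
        _, m1, m2 = term
        def application(k):
            def on_function(f):
                side, payload = f
                if side == "inl":
                    # Feed the argument to the offered continuation.
                    return cps_sum(m2, env)(payload)
                # The function has produced its result.
                return k(payload)
            return cps_sum(m1, env)(on_function)
        return application
    if tag == "const":
        return lambda k: k(term[1])
    raise ValueError(f"unknown term: {term!r}")

def run_sum(term):
    return cps_sum(term, {})(lambda v: v)
```

In an application, `on_function` is entered once with the `inl` alternative and, after the argument has been supplied, a second time with the `inr` alternative carrying the result.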

Naturally, if the body of the term λx.M does not actually depend on x, we could give a more
efficient translation.

(λx.M)* κ = M* κ′ where κ′ = (y ↦ κ INR(Id_y))

By continuing down these lines, can we give a logical account of call-by-need?




4.5  Related  Work 


"Classical  Curry-Howard".  Since  Griffin  [1990]  drew  the  connection  between  control  operators 
and  classical  theorems  (e.g.,  callcc  and  Peirce's  Law),  and  Murthy  [1992]  and  Parigot  [1992]  gave 
hope  that  classical  logic  could  be  provided  a  decent  proof  theory,  there  have  been  many  different 
attempts  at  building  a  constructive  interpretation  of  classical  logic.  These  include  different  calculi 
by  Barbanera  and  Berardi  [1996],  Ong  and  Stewart  [1997]  and  Streicher  and  Reus  [1998],  among 
others.  Stewart's  dissertation  [1999]  represents  a  particularly  sophisticated  analysis,  trying  to 
judge  the  resulting  natural  deduction  for  classical  logic  by  the  sort  of  "internal  justification" 
principles of proof-theoretic semantics described in Chapter 2. I explained the sense in which 𝓛
is and is not such a Curry-Howard interpretation of classical logic:

•  Any classical theorem can be polarized and inhabited by a term of 𝓛, but…

•  …the meaning of positive types is essentially intuitionistic, and the meaning of negative
types co-intuitionistic. Proof-by-contradiction is confined to negative types.

In other words, 𝓛 gives a propositions-as-types interpretation of classical logic basically in the
sense of being a syntactically elegant double-negation interpretation. This may not seem like a
very satisfactory interpretation: have we made much progress on understanding the constructive
content of classical logic since Kolmogorov [1925]?

I  think  polarization  and  focusing  do  provide  some  insights.  On  the  one  hand,  they  tell  us 
that  classical  propositions  can  be  given  constructive  readings  which  are  not  very  different  from 
the  standard  intuitionistic  ones,  or  alternatively,  dual  to  them.  But  then  they  also  tell  us  that 
there  is  no  single  constructive  content  of  a  classical  theorem,  because  there  are  many  different 
ways  to  polarize  a  classical  proposition.  And  finally,  they  provide  a  constructive  interpretation 
of  duality,  which  is  an  important  and  useful  concept. 

Computational duality. Beginning with Filinski's master's thesis [1989], a line of work has
explored a duality between call-by-value and call-by-name evaluation in the presence of
first-class continuations. Filinski was inspired by categorical duality, but did not make a connection
to logic. The logical accounts came following attempts by Girard [1991a, 1993] and by Danos,
Joinet, and Schellinx [1995, 1997] to understand cut-elimination for classical sequent calculus
through the lens of linear logic and polarity. Ogata [2000] showed how to translate Danos et
alia's LKT and LKQ into a CPS calculus (like the one we used in §4.4, also discussed by Thielecke
[1997]), and Curien and Herbelin [2000] showed how LKT and LKQ could be directly annotated
as call-by-name and call-by-value languages (respectively). Curien and Herbelin also defined a
larger sequent calculus/language, of which CBN and CBV were fragments. However, this larger
calculus was non-confluent, so they noted one can obtain confluence by imposing a global bias
towards either call-by-value or call-by-name. The same approach has been more recently refined
and exposited by Wadler [2003].

Again, in retrospect, it seems the main deficiency of these languages defined by Filinski
[1989], Curien and Herbelin [2000], and Wadler [2003] was that call-by-value and call-by-name
were interpreted as global policy decisions, rather than as local policies determined by types.
This precluded the definition of a single language subsuming both call-by-value and call-by-name
evaluation. However, such a type distinction is implicit (from a semantic perspective) in
Selinger's control categories [2001], and explicit syntactically in Levy's call-by-push-value language
[2001].

Call-By-Push-Value. In the Introduction I suggested that this work gives a proof-theoretic
reconstruction of some of the ideas from CBPV. Let me try to explain the similarities between 𝓛
and CBPV, the differences, and what the proof-theoretic approach has to offer beyond what is
already present in Levy's account. The type structure of CBPV is very close to that of 𝓛:

        A ::= U B | Σ_{i∈I} A_i | A × A

        B ::= F A | Π_{i∈I} B_i | A → B

Types  A  are  value  types,  and  types  B  are  computation  types.  And  clearly  the  analogy  is, 

        value ~ positive        computation ~ negative.

Like positive types, value types include possibly infinite sums (e.g., N) and finite products (⊗).
Like negative types, computation types include possibly infinite products (e.g., <S|B) and polarity-
mixing functions (→). The coercions U and F play the role of ↓ and ↑, respectively.
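
To make the analogy concrete, here is a minimal sketch of the two type grammars side by side, with the evident translation between them. It is written in present-day Agda syntax and is only an illustration under simplifying assumptions: the I-indexed sums and products are replaced by binary ones, and every name in it (ValTy, CompTy, Pos, Neg, down, up, plus, times, both, fun, polV, polC) is invented for the sketch rather than taken from Levy's presentation or from the definitions of this thesis.

```agda
-- Hypothetical, finitary rendering of the CBPV type grammar.
mutual
  data ValTy : Set where              -- value types A
    U   : CompTy → ValTy
    _⊕_ : ValTy → ValTy → ValTy       -- binary stand-in for the I-indexed sum
    _⊗_ : ValTy → ValTy → ValTy

  data CompTy : Set where             -- computation types B
    F   : ValTy → CompTy
    _&_ : CompTy → CompTy → CompTy    -- binary stand-in for the I-indexed product
    _⇒_ : ValTy → CompTy → CompTy

-- A matching fragment of polarized types (constructor names are assumptions).
mutual
  data Pos : Set where
    down  : Neg → Pos                 -- shift from negative to positive
    plus  : Pos → Pos → Pos
    times : Pos → Pos → Pos

  data Neg : Set where
    up   : Pos → Neg                  -- shift from positive to negative
    both : Neg → Neg → Neg
    fun  : Pos → Neg → Neg

-- value ~ positive, computation ~ negative
mutual
  polV : ValTy → Pos
  polV (U B)    = down (polC B)
  polV (A ⊕ A') = plus (polV A) (polV A')
  polV (A ⊗ A') = times (polV A) (polV A')

  polC : CompTy → Neg
  polC (F A)    = up (polV A)
  polC (B & B') = both (polC B) (polC B')
  polC (A ⇒ B)  = fun (polV A) (polC B)
```

The translation is structural on both sides, which is all that "value ~ positive, computation ~ negative" claims at the level of types. The sketch covers only the connectives that CBPV has; nothing in it corresponds to the connectives that exist only on the polarized side.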

The difference is that value types and computation types are not given symmetric treatment in
CBPV, whereas positive and negative types are completely dual in ℒ. There is no ⅋ connective
in CBPV, nor the subtraction operator dual to implication. More to the point, there are no
continuation variables in CBPV, and a term of type FA does not necessarily have access to its
continuation (only in models of CBPV with control operators). Basically, we can summarize
the difference between the type structure of ℒ and the type structure of CBPV as the difference
between polarizing classical logic and polarizing intuitionistic logic.

Depending  on  your  perspective,  this  asymmetry  in  CBPV  may  be  seen  as  an  unnecessary 
restriction,  or  as  an  important  gain  in  expressivity.  But  in  either  case,  the  point  is  that  while 
this difference exists between CBPV and ℒ as I have defined it here, it is not inherent to the
ideas of polarity and focusing. It is possible to give an asymmetric, pattern-based formulation of
focusing for intuitionistic logic (as I have done elsewhere [Zeilberger, 2008, Licata et al., 2008]),
and then the type structure exactly mirrors that of CBPV. And conversely, Levy often analyzes
CBPV in terms of a simpler language "jump-with-argument" (JWA), which is very close to ℒ⁺.

So besides this subtle but ultimately inessential difference in symmetry, what does our proof-
theoretic analysis add to the CBPV story, besides the simple fact that such a proof-theoretic
analysis exists? Well, although the types of CBPV and ℒ are very similar, their type systems
are structured quite differently. As we have seen, the terms of ℒ correspond to derivations in
a sequent calculus, subject to the subformula property, and are defined generically by reference
to patterns and variables. CBPV, in contrast, has a more standard, natural deduction-style
definition, with introduction and elimination rules for each connective. The combination of the
subformula property and the pattern-based definition is important, because as we saw, it often
allows us to make generic arguments about programs of ℒ and ℒ⁺, without having to consider
individual types. For example, we could give a standard syntactic type safety argument in
the style of Wright and Felleisen [1994], but avoiding much of the usual bureaucracy of such
arguments. (Our type safety theorem was a bit non-standard in being an intrinsic type safety
theorem, but we will also give an extrinsic one in Chapter 6.) Likewise, the Separation Theorem
was stated and proved generically. We will see more demonstrations of the power of patterns
in Chapter 6 when we consider refinement types such as intersections and unions, which have
not been studied in the CBPV setting.¹¹

Ludics. Many of our definitions and results were inspired and informed by ludics [Girard,
2001]. There is a close correspondence between the terms of ℒ and the designs of ludics. Roughly,

        locus ~ variable                  ramification ~ frame
        positive design ~ expression      negative design ~ substitution
        cut-net ~ program

ℒ is not just an alternate presentation of ludics, though. Our definitions were motivated by
programming languages intuitions, and at times these diverge from ludics, most notably in
the notion of pattern itself, which is absent in ludics. Likewise, ℒ has first-class notions of
value and continuation, which are not used in ludics. The chronicle representation (§4.2.14)
and the compound rules for the environment semantics (go+ and go-) show that values and
continuations are strictly speaking unnecessary, because they can be factored out into expressions
and substitutions. Nonetheless, because they match the usual notions of value and continuation
from operational semantics, they provide useful intuition for the meaning of types.

Ludics  has  been  an  inspiration  for  many  others  as  well.  Perhaps  most  closely  related  to  our 
work,  Terui  [2008]  has  recently  begun  exploring  a  "computational  ludics",  as  a  first  step  towards 
bridging the gap between computability/complexity theory and logic/type theory.

Pattern-matching  and  realizability.  Pattern-matching  is  a  well-established  and  important 
feature  of  functional  programming  languages  [Augustsson,  1987,  Peyton  Jones,  1987].  The  higher- 
order  formulation  of  pattern-matching  given  here  as  an  iterated  inductive  definition  is  (as  far  as  I 
know) new, although it bears some similarities to the approach of Coquand [1992]. The general
idea of allowing computation in syntax is not really new, however. In a sense it is already
present in the BHK realizability interpretation of intuitionistic logic, with the idea that a proof
of A ⊃ B is a procedure for converting proofs of A into proofs of B [Heyting, 1974, Kolmogorov,
1932]. Note, though, that our interpretation is only second-order, because patterns are first-order
objects — whereas  the  realizability  interpretation  is  arbitrarily  higher-order.  Realizability  was  put 
into  practice  in  the  NuPRL  framework  [Constable  et  al.,  1986],  and  Howe  [1991]  explored  how 
it  leads  to  a  notion  of  "computational  open-endedness",  i.e.,  the  idea  that  even  in  constructive 
settings  the  meta-level  functions  are  somehow  arbitrary.  We  will  have  much  more  to  say  about 
this  in  the  next  chapter. 


¹¹ It is worth noting that a notion of "ultimate pattern" similar to ours has shown up in games semantics
interpretations of CBPV and JWA [Lassen and Levy, 2007]. But then why exclude it from syntax?

Chapter 5

Concrete notations for abstract derivations


It thus became apparent that the "finite standpunkt" is not the only alternative to
classical ways of reasoning and is not necessarily implied by the idea of proof theory.
An  enlarging  of  the  methods  of  proof  theory  was  therefore  suggested:  instead  of  a 
restriction  to  finitist  methods  of  reasoning,  it  was  required  only  that  the  arguments  be 
of  a  constructive  character,  allowing  us  to  deal  with  more  general  forms  of  inference. 

— Paul  Bernays  (quoted  by  Buchholz  et  al.  [1981,  p.55]) 


In the previous chapter we defined a new language, ℒ, but our aim was not to present this
language  for  its  own  sake,  but  rather  to  use  it  as  an  example  of  a  new,  more  abstract  approach 
to  language  definition  based  on  polarity  and  focusing.  What  made  this  approach  unusual  was 
that  syntax  was  given  a  second-order  definition:  certain  kinds  of  syntactic  objects  (positive 
continuations  and  negative  values)  were  constructed  as  functions  on  more  basic  (first-order) 
syntactic  objects,  patterns.  This  allowed  us  to  define  a  generic  notion  of  computation  while 
remaining  in  a  typed  setting,  deriving  the  operational  behavior  of  individual  type  constructors 
from  their  pattern-formation  rules. 

But  it  may  seem  a  bit  nebulous.  What  exactly  are  these  functions  on  patterns?  Because  a 
type  can  have  infinitely  many  patterns,  different  possibilities  exist:  primitive  recursive  functions, 
partial  recursive  functions,  functional  relations  in  ZFC  set  theory,  etc.  On  the  one  hand,  this 
computational  open-endedness  may  be  seen  as  a  virtue,  since  it  allows  our  language  to  be 
interpreted  in  many  different  settings.  But  on  the  other,  it  may  be  seen  as  a  sign  that  the 
language  definition  is  too  open-ended.  After  all,  syntax  is  something  that  ultimately  has  to  be 
written  down,  and  is  supposed  to  communicate  some  meaning  unambiguously.  How  do  we  write 
down  these  potentially  infinitary  functions  on  patterns,  and  how  do  we  interpret  them? 

In this chapter, I will give two different, very concrete answers to these questions, by showing
how to embed programs of ℒ (or at least the ℒ⁺ fragment) in two different meta-languages.
These meta-languages, Agda¹ and Twelf², are real, working programming languages, and so
we inherit a concrete syntax for writing down ℒ programs in such a way that they can actually
be type-checked and executed by machine. Both Agda's and Twelf's core type theories are
intellectual descendants of Martin-Löf's dependent type theory [Nordström et al., 1990], but
down two very different paths: the type theory of Agda (iterated inductive-recursive definitions
[Dybjer, 2000]) is descended from Martin-Löf's "theory of sets", while the type theory of Twelf
(LF [Harper et al., 1993]) is descended from Martin-Löf's "system of arities". As we will see, in
each of these frameworks, certain aspects of ℒ can be represented elegantly and concisely, while
other aspects have to be encoded more indirectly. For our present purposes, though, this is an
advantage: by manually going through the process of compiling some of the more abstract
features of our language definition down to lower-level primitives, we get a better understanding
of those features.


5.1 An embedding of ℒ⁺ in Agda

In this section we will show how to represent the syntax and semantics of ℒ⁺ in Agda. We
restrict to ℒ⁺ rather than ℒ because it conveys most of the essential features. Some familiarity
with dependently typed functional programming (such as in Coq or NuPRL, if not with Agda
in particular) is probably a prerequisite for understanding this section. A classic introduction
is provided by Nordström et al. [1990], and McBride [2004] provides a more modern one.

The first thing we do to encode the syntax of ℒ⁺ is simply define a grammar of types. An
open-ended definition would be desirable if we were using the language for real programming,
but here we will simply copy the type constructors of §4.2.1:

data Pos : Set where
  _+_  : Pos -> Pos -> Pos   -- binary sums
  void : Pos                 -- void
  _*_  : Pos -> Pos -> Pos   -- binary products
  unit : Pos                 -- unit
  ¬_   : Pos -> Pos          -- continuations
  bool : Pos                 -- booleans
  nat  : Pos                 -- naturals
  dom  : Pos                 -- recursive domain

infixr 15 _+_
infixr 16 _*_

Pos (the Agda type representing ℒ⁺ types) is just an ordinary algebraic data type. (Agda has
some neat parsing tricks, so we tell it the precedence of the right-associative, infix product and
sum constructors.) Next we define a data type of frames, represented as join lists of continuation
holes:

data Frame : Set where
  •_  : Pos -> Frame
  ·   : Frame
  _,_ : Frame -> Frame -> Frame
infixr 13 _,_

¹ The Agda code included in this chapter is actually written in "Agda 2" (v2.1.3), the redesign and
reimplementation by Ulf Norell [2007] of an older language, also called Agda. At the moment, Agda 2 is in
active development and evolving rapidly, so the code included here may or may not be syntactically
well-formed a few years/months down the line. Hopefully, though, it will still make sense. A good source of
information about Agda, http://wiki.portal.chalmers.se/agda/, also may or may not exist.

² The Twelf [Pfenning and Schürmann, 1999] code in this chapter is written in version 1.5. Hopefully when you
are reading this the Twelf wiki still exists at http://twelf.plparty.org/.

Now, we want to encode the containment relationship for frames. In Agda, we encode it as an
inductive type family (note that the arguments of the ∀ quantifier within curly braces represent
implicit arguments to the constructors, which Agda will attempt to infer):

infix 10 _∈_

data _∈_ : Frame -> Frame -> Set where
  here  : ∀ {Δ} -> Δ ∈ Δ
  left  : ∀ {Δ Δ₁ Δ₂} -> Δ ∈ Δ₁ -> Δ ∈ (Δ₁ , Δ₂)
  right : ∀ {Δ Δ₁ Δ₂} -> Δ ∈ Δ₂ -> Δ ∈ (Δ₁ , Δ₂)

The constructors here match those we gave in §4.2.2 when we discussed what programming in
ℒ⁺ would look like without variable names. A proof of a containment Δ₁ ∈ Δ₂ is analogous to
a de Bruijn index, representing the path we take to get from Δ₂ to its sublist Δ₁. For example,
we could represent the variable κ₂ : •B ∈ (κ₁ : •A, κ₂ : •B, κ₃ : •C) by the following index:

κ₂ : ∀ {A B C} -> • B ∈ (• A , • B , • C)
κ₂ = right (left here)

So what we are starting to do here is give a nameless representation of ℒ⁺. This is the easiest
path  towards  encoding  the  language  in  Agda — otherwise  we  would  have  to  explicitly  build 
much  of  the  machinery  for  dealing  with  variable  names,  and  it  would  complicate  the  encoding. 
However,  it  will  make  the  process  of  actually  writing  programs  in  our  embedding  more  tedious, 
because  we  have  to  reason  about  indices. 
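
To get a feel for the index reasoning involved, here is a small self-contained sketch giving the indices of all three holes of a three-element frame, not only the middle one. It is written in present-day Agda syntax (Unicode arrows) rather than the v2.1.3 dialect used in the rest of the chapter, and it uses a cut-down Pos with three constructors so that the block stands alone; κ₁ and κ₃ are names added for the illustration.

```agda
-- A cut-down copy of the chapter's definitions, so the block stands alone.
data Pos : Set where
  unit bool nat : Pos

data Frame : Set where
  •_  : Pos → Frame
  ·   : Frame
  _,_ : Frame → Frame → Frame
infixr 13 _,_

infix 10 _∈_
data _∈_ : Frame → Frame → Set where
  here  : ∀ {Δ} → Δ ∈ Δ
  left  : ∀ {Δ Δ₁ Δ₂} → Δ ∈ Δ₁ → Δ ∈ (Δ₁ , Δ₂)
  right : ∀ {Δ Δ₁ Δ₂} → Δ ∈ Δ₂ → Δ ∈ (Δ₁ , Δ₂)

-- The frame (• A , • B , • C) associates to the right, i.e. it is
-- (• A , (• B , • C)), so each index is a path through that tree.
κ₁ : ∀ {A B C} → • A ∈ (• A , • B , • C)
κ₁ = left here

κ₂ : ∀ {A B C} → • B ∈ (• A , • B , • C)
κ₂ = right (left here)

κ₃ : ∀ {A B C} → • C ∈ (• A , • B , • C)
κ₃ = right (right here)
```

Because the join operator is declared right-associative, the last hole is reached by two right turns and no final left turn. Swapping any of the three paths for another makes the definition fail to typecheck, which is the sense in which the index is a proof of containment.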

Once  we  have  defined  frames,  we  can  define  patterns,  also  as  an  inductive  family: 

infix 10 _⊩_

data _⊩_ : Frame -> Pos -> Set where
  hole  : ∀ {A} -> • A ⊩ ¬ A
  #<>   : · ⊩ unit
  #inl  : ∀ {Δ A B}
          -> Δ ⊩ A
          -> Δ ⊩ A + B
  #inr  : ∀ {Δ A B}
          -> Δ ⊩ B
          -> Δ ⊩ A + B
  #pair : ∀ {Δ₁ Δ₂ A B}
          -> Δ₁ ⊩ A -> Δ₂ ⊩ B
          -> Δ₁ , Δ₂ ⊩ A * B
  #tt   : · ⊩ bool
  #ff   : · ⊩ bool
  #z    : · ⊩ nat
  #s_   : ∀ {Δ}
          -> Δ ⊩ nat
          -> Δ ⊩ nat
  #dn   : ∀ {Δ}
          -> Δ ⊩ nat
          -> Δ ⊩ dom
  #dk   : ∀ {Δ}
          -> Δ ⊩ ¬ dom
          -> Δ ⊩ dom
infixr 15 #s_

These  constructors  let  us  build  patterns  much  the  same  way  as  we  did  in  Chapter  4,  except  that 
they  can  never  name  a  continuation  variable,  only  give  a  place  for  a  hole. 

Now that we have patterns, we are almost ready to define the terms of ℒ⁺. We first need to
give the boring definition of contexts, and containment in a context:

data Ctx : Set where
  ··   : Ctx
  _,,_ : Ctx -> Frame -> Ctx
infixl 12 _,,_

infixr 10 _∈∈_

data _∈∈_ : Frame -> Ctx -> Set where
  xO  : ∀ {Δ Γ Δ'} -> Δ ∈ Δ' -> Δ ∈∈ Γ ,, Δ'
  xS_ : ∀ {Δ Γ Δ'} -> Δ ∈∈ Γ -> Δ ∈∈ Γ ,, Δ'

infixr 15 xS_

We  also  introduce  an  Agda  type  J,  which  simply  names  the  different  kinds  of  judgments: 

data J : Set where
  True  : Pos -> J
  False : Pos -> J
  All_  : Frame -> J
  #     : J

infix 10 All_

Finally, we can define a type family Γ ⊢ J, whose inhabitants will be the terms of our language:

infix 8 _⊢_

codata _⊢_ : Ctx -> J -> Set where

Notice  we  mark  this  definition  as  "codata"  rather  than  "data".  This  tells  Agda  to  treat  the 
definition  that  follows  as  coinductive  rather  than  inductive,  and  thus  to  be  more  relaxed  when 
we  build  non-well-founded  terms.  Formally,  though,  we  don't  really  mean  to  restrict  to  productive 
terms  either,  because  we  want  to  be  able  to  define  continuations  by  arbitrary  partial  recursive 
functions  on  patterns.  Fortunately,  Agda  works  just  fine  as  a  partial  type  theory — it  will  warn 
us  when  a  term  is  not  productive,  but  still  accept  it. 

Let us move on to the definition of Γ ⊢ J. First, we define values as patterns under
substitutions:

  _[_] : ∀ {Γ Δ A}
         -> Δ ⊩ A -> Γ ⊢ All Δ
         -> Γ ⊢ True A

Note that _[_] is a mixfix operator: the parser lets us write values as p [ σ ]. Now, we define
continuations as maps from patterns to expressions:

  con_ : ∀ {Γ A}
         -> (∀ {Δ} -> Δ ⊩ A -> Γ ,, Δ ⊢ #)
         -> Γ ⊢ False A

This is really the key trick. We are giving a higher-order definition of the syntax of our language,
by embedding meta-level (i.e., Agda-level) functions over patterns as object-level (i.e., ℒ⁺-level)
continuations. Moreover, we are making essential use of the fact that the meta-level function
space is dependent: the type of the constructor con doesn't just tell us that continuations are maps
from patterns to expressions, but that they map A-patterns with frame Δ to expressions in a
context extended by Δ. We will soon explore what this means in greater depth, with examples.
Now, we finish the definition of ℒ⁺ by defining substitutions and expressions:

  -- combinators for substitutions
  sHole : ∀ {Γ A} -> Γ ⊢ False A -> Γ ⊢ All • A
  sNil  : ∀ {Γ} -> Γ ⊢ All ·
  sJoin : ∀ {Γ Δ₁ Δ₂}
          -> Γ ⊢ All Δ₁ -> Γ ⊢ All Δ₂ -> Γ ⊢ All Δ₁ , Δ₂

  -- a value thrown to a continuation variable
  throw : ∀ {Γ A}
          -> • A ∈∈ Γ -> Γ ⊢ True A
          -> Γ ⊢ #

  -- immediate failure
  ℧ : ∀ {Γ} -> Γ ⊢ #
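
The role that dependency plays in the type of con can be isolated in a much smaller setting. The following is a minimal sketch, not part of the embedding itself, written in present-day Agda syntax (Unicode arrows, data rather than codata). Frames are collapsed to a mere count of holes, and Pat, Var, Exp, Con, jump, fail and example are all names made up for the illustration.

```agda
data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

-- Addition by recursion on the second argument, so that n + suc m
-- reduces to suc (n + m) without appealing to any lemma.
_+_ : Nat → Nat → Nat
n + zero  = n
n + suc m = suc (n + m)

-- Patterns, indexed only by how many holes they bind
-- (a crude stand-in for the chapter's frames).
data Pat : Nat → Set where
  hole  : Pat (suc zero)
  #tt   : Pat zero
  #ff   : Pat zero
  #pair : ∀ {m n} → Pat m → Pat n → Pat (m + n)

-- Variables in a context of n holes (de Bruijn indices).
data Var : Nat → Set where
  vz : ∀ {n} → Var (suc n)
  vs : ∀ {n} → Var n → Var (suc n)

-- Expressions in a context of n holes.
data Exp (n : Nat) : Set where
  jump : Var n → Exp n
  fail : Exp n

-- A continuation is an Agda function on patterns whose result lives in
-- the context extended by exactly the holes that the pattern binds.
data Con (n : Nat) : Set where
  con : (∀ {m} → Pat m → Exp (n + m)) → Con n

example : ∀ {n} → Con n
example {n} = con K
  where
    K : ∀ {m} → Pat m → Exp (n + m)
    K hole = jump vz   -- the hole just bound is available as a variable
    K _    = fail
```

In the clause for hole the result type computes to Exp (suc n), so the freshly bound variable vz is in scope; in the clauses for #tt and #ff no variable is added. A non-dependent type such as Pat → Exp could not express this connection between the pattern that was matched and the context of the body.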

Before  considering  some  specific  examples,  let  us  make  some  general  observations,  mirroring 
the  ones  we  made  in  Chapter  4.  First,  we  can  formally  derive  the  analogue  of  Observation  4.2.3, 
showing how to apply a substitution to a continuation variable in its frame to obtain a
continuation, and conversely, how to construct a substitution given a map from continuation
variables to continuations:

appSub : ∀ {Γ Δ}
         -> Γ ⊢ All Δ
         -> (∀ {A} -> • A ∈ Δ -> Γ ⊢ False A)
appSub (sHole K) here ~ K
appSub sNil ()  -- the empty frame has no variables, so this case is refuted
appSub (sJoin σ₁ σ₂) (left κ) ~ appSub σ₁ κ
appSub (sJoin σ₁ σ₂) (right κ) ~ appSub σ₂ κ

sub_ : ∀ {Γ Δ}
       -> (∀ {A} -> • A ∈ Δ -> Γ ⊢ False A)
       -> Γ ⊢ All Δ
sub_ {Δ = ·} f ~ sNil
sub_ {Δ = Δ₁ , Δ₂} f ~ sJoin (sub \x -> f (left x)) (sub \x -> f (right x))
sub_ {Δ = • _} f ~ sHole (f here)

(Here the clauses are specified with ~ rather than =, because the type family Γ ⊢ J is coinductive.)
Similarly, we can show that given an A-continuation and an A-pattern with frame Δ, we can
obtain an expression in a context extended by Δ. The proof is quite trivial, since it just reduces
to Agda function application:

appCon : ∀ {Γ A}
         -> Γ ⊢ False A
         -> ∀ {Δ} -> Δ ⊩ A -> Γ ,, Δ ⊢ #
appCon (con K) p ~ K p


Now, let's try to build the ℒ⁺ terms witnessing the identity principles. To do that, we need some
easy lemmas about the transitivity of containment:

trans∈ : {Δ₁ Δ₂ Δ₃ : Frame}
         -> Δ₁ ∈ Δ₂ -> Δ₂ ∈ Δ₃ -> Δ₁ ∈ Δ₃
trans∈ Δ₁ here = Δ₁
trans∈ here Δ₁ = Δ₁
trans∈ Δ₁ (left Δ₂) = left (trans∈ Δ₁ Δ₂)
trans∈ Δ₁ (right Δ₂) = right (trans∈ Δ₁ Δ₂)

trans∈∈ : ∀ {Δ₁ Δ₂ Γ} -> Δ₁ ∈ Δ₂ -> Δ₂ ∈∈ Γ -> Δ₁ ∈∈ Γ
trans∈∈ x (xO y) = xO (trans∈ x y)
trans∈∈ x (xS y) = xS (trans∈∈ x y)

We  can  then  build  the  identity  continuation  and  the  identity  substitution,  with  mutually  recursive 
definitions: 

mutual
  IdCon : ∀ {Γ A} -> • A ∈∈ Γ -> Γ ⊢ False A
  IdCon κ ~ con \p -> throw (xS κ) (p [ IdSub (xO here) ])

  IdSub : ∀ {Γ Δ} -> Δ ∈∈ Γ -> Γ ⊢ All Δ
  IdSub {Δ = • A} κ ~ sHole (IdCon κ)
  IdSub {Δ = ·} σ ~ sNil
  IdSub {Δ = Δ₁ , Δ₂} σ ~ sJoin (IdSub (trans∈∈ (left here) σ))
                                 (IdSub (trans∈∈ (right here) σ))

These  definitions  mirror  the  definitions  in  §4.2.7,  except  for  the  bureaucracy  involving  context 
indices.  Agda  verifies  that  the  identity  terms  are  productive. 

We won't consider the binary composition principles here; the reader can consult
Appendix ??. But for both binary composition and for the environment semantics (which we
define below), we need the weakening property on contexts. As I explained in Chapter 4, in
the representation of ℒ⁺ using scoped variable names, weakening comes "for free", since it has
no effect on terms. With the de Bruijn-ish representation, though, we need to give an explicit
proof of weakening. To even state weakening (and composition), though, we first need to define
a notion of splitting of contexts:

data split : Ctx -> Ctx -> Frame -> Ctx -> Set where
  here  : ∀ {Δ Γ}
          -> split (Γ ,, Δ) Γ Δ ··
  skip  : ∀ {Γ Γ₁ Δ Γ₂ Δ'}
          -> split Γ Γ₁ Δ Γ₂
          -> split (Γ ,, Δ') Γ₁ Δ (Γ₂ ,, Δ')
  assoc : ∀ {Γ Δ₁ Δ₂ Γ₁ Δ Γ₂}
          -> split (Γ ,, Δ₁ ,, Δ₂) Γ₁ Δ Γ₂
          -> split (Γ ,, (Δ₁ , Δ₂)) Γ₁ Δ Γ₂
  nil   : ∀ {Γ Γ₁ Δ Γ₂}
          -> split Γ Γ₁ Δ Γ₂
          -> split (Γ ,, ·) Γ₁ Δ Γ₂

The relation split Γ Γ₁ Δ Γ₂ says that Γ can be split into an initial context Γ₁, a hole plugged
in by Δ, and a trailing context Γ₂. The weakening property of variable indices says that if Γ can
be split as Γ = Γ₁(Δ),Γ₂, and Δ₀ is in the concatenation of Γ₁ and Γ₂, then Δ₀ is in Γ:

_++_ : Ctx -> Ctx -> Ctx
Γ₁ ++ ·· = Γ₁
Γ₁ ++ (Γ₂ ,, Δ) = (Γ₁ ++ Γ₂) ,, Δ

infixr 12 _++_

weakvar : ∀ {Γ Γ₁ Δ Γ₂ Δ₀}
          -> split Γ Γ₁ Δ Γ₂ -> Δ₀ ∈∈ Γ₁ ++ Γ₂
          -> Δ₀ ∈∈ Γ

The  proof  of  weakvar  is  not  so  interesting,  so  we  omit  it  here.  Once  we  have  it,  we  can  define 
weakening  of  terms  simply  by  crawling  through  and  applying  weakvar  to  any  variables: 

weaken : ∀ {Γ Γ₁ Δ Γ₂ J}
         -> split Γ Γ₁ Δ Γ₂ -> Γ₁ ++ Γ₂ ⊢ J
         -> Γ ⊢ J
weaken s (p [ σ ]) ~ p [ weaken s σ ]
weaken s (con φ) ~ con (\p -> weaken (skip s) (φ p))
weaken s (sHole K) ~ sHole (weaken s K)
weaken s (sNil) ~ sNil
weaken s (sJoin σ₁ σ₂) ~ sJoin (weaken s σ₁) (weaken s σ₂)
weaken s (throw κ V) ~ throw (weakvar s κ) (weaken s V)
weaken s ℧ ~ ℧

Finally, let us encode the ℒ⁺ environment semantics. First we define environments and the
lookup operation:

data Env : Ctx -> Set where
  emp    : Env ··
  _bind_ : ∀ {Γ Δ} -> Env Γ -> Γ ⊢ All Δ -> Env (Γ ,, Δ)
infixl 8 _bind_

lookup : ∀ {Γ A}
         -> Env Γ -> • A ∈∈ Γ -> Γ ⊢ False A
lookup emp ()  -- impossible pointer into empty environment
lookup (γ bind σ) (xO κ) = weaken here (appSub σ κ)
lookup (γ bind σ) (xS κ) = weaken here (lookup γ κ)

And  then  we  define  programs,  results,  and  the  evaluation  relation: 

data Prog : Set where
  prog : ∀ {Γ} -> Env Γ -> Γ ⊢ # -> Prog

codata Result : Set where
  abort : Result
  step  : Result -> Result

eval : Prog -> Result
eval (prog γ (throw κ (p [ σ ]))) ~
  step (eval (prog (γ bind σ) (appCon (lookup γ κ) p)))
eval (prog γ ℧) ~ abort

Note  that  for  concision,  we  are  encoding  the  compound  go+  rule,  rather  than  the  separate 
lookup+  and  bind/call+  rules.  We  are  also  making  somewhat  more  precise  here  the  approach 
to  divergence:  in  the  style  of  Leroy  [2006],  Results  are  defined  coinductively,  and  divergence  is 
represented  by  an  endless  series  of  steps. 
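
For readers working with a current Agda release, the same idea can be sketched without the codata keyword, which as far as I am aware later versions of Agda no longer accept. The block below is only an illustration under assumptions: it uses a coinductive record and copatterns, it assumes a recent Agda 2.6 release with the --guardedness option, and the names Delayed, force, loop, later and twoSteps are invented for the sketch. It does not depend on the rest of the embedding.

```agda
{-# OPTIONS --guardedness #-}

data Bool : Set where
  true false : Bool

data Nat : Set where
  zero : Nat
  suc  : Nat → Nat

-- Results: either an abort, or one step followed by a delayed result.
mutual
  data Result : Set where
    abort : Result
    step  : Delayed → Result

  record Delayed : Set where
    coinductive
    field force : Result

open Delayed

-- Divergence: an endless series of steps.
loop : Delayed
force loop = step loop

-- Wrap an already computed result as a delayed one.
later : Result → Delayed
force (later r) = r

-- A run that takes two steps and then aborts.
twoSteps : Result
twoSteps = step (later (step (later abort)))

-- Is the result free of aborts for the first n steps?
safeN : Result → Nat → Bool
safeN r        zero    = true
safeN abort    (suc _) = false
safeN (step d) (suc n) = safeN (force d) n

data Void : Set where
data Unit : Set where
  u : Unit

isTrue : Bool → Set
isTrue true  = Unit
isTrue false = Void

isFalse : Bool → Set
isFalse true  = Void
isFalse false = Unit

-- The typechecker runs safeN while checking these two declarations.
loopSafe3 : isTrue (safeN (step loop) (suc (suc (suc zero))))
loopSafe3 = u

twoStepsUnsafe3 : isFalse (safeN twoSteps (suc (suc (suc zero))))
twoStepsUnsafe3 = u
```

The definition of loop is accepted because the recursive occurrence sits under the constructor step, and safeN terminates because it recurses on its numeric argument, so observing any finite prefix of a diverging result is unproblematic.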

At last! We have our Agda embedding of ℒ⁺, both its syntax and semantics. Can we start
writing programs now? We will begin by writing a silly one:

selfapp : ∀ {Γ} -> Γ ⊢ False dom
selfapp ~ con K
  where
    K : ∀ {Δ} -> Δ ⊩ dom -> _ ,, Δ ⊢ #
    K (#dk hole) ~ throw (xO here) (#dk hole [ sHole selfapp ])
    K (#dn _) ~ ℧

selfapp is the self-applying D-continuation defined in §4.2.8. Here the syntax looks more or
less like it did there, except that we have to reference the continuation variable with a de Bruijn
index. Observe that we only give two clauses when defining selfapp, relying on Agda's coverage
checker to verify that the meta-level function is exhaustive over dom-patterns. (The #dn case can't
be ruled out, although it is senseless in terms of the intended semantics of selfapp, so we simply
return ℧ as a sort of error. We will have more to say about this in Chapter 6.) Also observe
that the definition of selfapp is circular: selfapp is used as an embedded substitution. The
definition is nonetheless productive.

By applying selfapp to itself, we can write the diverging program:

ωω : Prog
ωω = prog (emp bind sHole selfapp)
          (throw (xO here) (#dk hole [ sHole selfapp ]))

Let's try to verify that ωω really diverges, or at least that it runs for a while... We start by defining
a procedure that decides whether a program is safe (i.e., does not abort) for n steps:

data Bool : Set where
  True  : Bool
  False : Bool

data Nat : Set where
  Z  : Nat
  S_ : Nat -> Nat
infixr 12 S_

safeN : Result -> Nat -> Bool
safeN R Z = True
safeN abort (S _) = False
safeN (step R) (S n) = safeN R n

We could just run safeN on ωω and look at the result, but since we are working in dependent
type theory, it's more fun to have the typechecker do it. We use a well-known trick:

data Void : Set where
data Unit : Set where
  u : Unit

isTrue : Bool -> Set
isTrue True = Unit
isTrue False = Void

isFalse : Bool -> Set
isFalse True = Void
isFalse False = Unit

safe10ωω : isTrue (safeN (eval ωω) (S S S S S S S S S S Z))
safe10ωω = u

The type of safe10ωω verifies that ωω runs for at least ten steps.

Now consider another, slightly less silly ℒ⁺ term:

isEven : ∀ {Γ} -> • bool ∈∈ Γ -> Γ ⊢ False nat
isEven κ ~ con K
  where
    K : ∀ {Δ} -> Δ ⊩ nat -> _ ,, Δ ⊢ #
    K #z ~ throw (xS κ) (#tt [ sNil ])
    K (#s #z) ~ throw (xS κ) (#ff [ sNil ])
    K (#s #s n) ~ K n

This is the function of type N → B that decides whether a number is even, expressed as a
continuation transformer. We can also define a generic combinator for building B-continuations,
given a pair of expressions:

branch : ∀ {Γ} -> Γ ⊢ # -> Γ ⊢ # -> Γ ⊢ False bool
branch E₁ E₂ ~ con K
  where
    K : ∀ {Δ} -> Δ ⊩ bool -> _ ,, Δ ⊢ #
    K #tt ~ weaken here E₁
    K #ff ~ weaken here E₂

Now,  consider  the  following  program: 

even9 : Prog
even9 = prog (emp
              bind sHole selfapp
              bind sHole (branch (throw (xO here) (#dk hole [ sHole selfapp ])) ℧)
              bind sHole (isEven (xO here)))
             (throw (xO here) (#s #s #s #s #s #s #s #s #s #z [ sNil ]))

This might be a bit difficult to parse. Ultimately, even9 is supposed to test whether the
number nine is even. It begins by binding selfapp in the environment, and then defines a
bool-continuation which diverges on true and aborts on false, and binds that. Finally, it binds
isEven with the aforementioned bool-continuation as its return address, and calls isEven on
nine. So, how does this actually behave? Well, it's safe for at least one step:

safe1even9 : isTrue (safeN (eval even9) (S Z))
safe1even9 = u

Indeed, it's safe for two steps:

safe2even9 : isTrue (safeN (eval even9) (S S Z))
safe2even9 = u

But after three steps, it aborts:

unsafe3even9 : isFalse (safeN (eval even9) (S S S Z))
unsafe3even9 = u

This might be a bit counterintuitive: how can the program tell that nine is not even in just three
steps? The point is that the second step is really a large step, because we defined isEven using
computation at the meta-level. The continuation immediately decides to call branch with #ff,
which then decides to run ℧.
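
The "large step" can be observed in isolation. The following stand-alone sketch, in present-day Agda syntax, strips away frames, contexts and continuations and keeps only a meta-level function on closed nat-patterns with the same three-clause shape as the K inside isEven. The names NatPat, Verdict, decide, nine and nineIsOdd are assumptions of the sketch, not part of the embedding.

```agda
-- A minimal propositional equality, to avoid depending on any library.
infix 4 _≡_
data _≡_ {A : Set} (x : A) : A → Set where
  refl : x ≡ x

-- Closed patterns for natural numbers.
data NatPat : Set where
  #z : NatPat
  #s : NatPat → NatPat

data Verdict : Set where
  even odd : Verdict

-- Same recursion scheme as the K inside isEven: strip two successors
-- at a time, at the meta-level.
decide : NatPat → Verdict
decide #z          = even
decide (#s #z)     = odd
decide (#s (#s n)) = decide n

nine : NatPat
nine = #s (#s (#s (#s (#s (#s (#s (#s (#s #z))))))))

-- Checked by normalization alone: no object-level step is involved.
nineIsOdd : decide nine ≡ odd
nineIsOdd = refl
```

All of the recursion in decide is carried out by Agda's evaluator when the function is applied, which is why it contributes nothing to the step count that safeN observes.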

With  this  we  conclude  our  discussion  of  the  Agda  embedding.  The  reader  is  invited  to  try 
writing  some  programs  of  their  own. 


5.2 An embedding of ℒ⁺ in Twelf

Essentially, the Agda encoding is a direct transcription of the logical rules of Chapter 2. That is,
it follows the definition of ℒ⁺ almost literally, except where in Chapter 4 we introduced some
additional conventions for referring to continuation variables by labels. We could have given
an explicit treatment of variable names (see, e.g., Aydemir et al. [2008] for a survey of different
possible techniques), but this would have necessitated introducing additional scaffolding which
was not present in our original, albeit informal, definition of ℒ⁺. We did not do anything new
in the above code, except to give more explicit proofs of some lemmas we already used in
Chapter 2.

In  other  words,  the  dependent  type  theory  of  Agda  lets  us  take  seriously  our  informal  ideas 
about  computation  in  syntax,  but  refuses  to  take  seriously  our  informal  ideas  about  variable 
binding. 

In Twelf, the situation is precisely reversed. As mentioned above, the type theory of Agda is
inspired by Martin-Löf's theory of sets, which includes as one essential feature the dependent
function space Πx : A.B. The type Πx : A.B represents computations from terms t of type A
to terms of type B(t). In contrast, the type theory of Twelf is inspired by Martin-Löf's system
of arities, which includes a different dependent function space, written (x : A).B.3 The type
(x : A).B represents a substitution function from terms t of type A to terms of type B(t). A
substitution function is something much more limited than a computation: it cannot inspect its
argument, only use it as a black box.

This weakness of substitution functions in comparison to computational functions is also a
strength, because it means they exactly encode the notion of variable binding: the type (x : A).B
represents a B(x) with a hole for an x : A. This enables a technique commonly known as
higher-order abstract syntax [Pfenning and Elliott, 1988], whereby different variable binding
constructs in the syntax of a language are directly represented by the LF function space. But it
also means that we fundamentally cannot use (x : A).B to represent pattern-matching, because it
allows no inspection of x, and the whole point of pattern-matching is to decompose/case-analyze
the argument.4

3 In the notation of Nordström et al. [1990]. The original LF paper [Harper et al., 1993], as well
as subsequent papers on LF and Twelf, use the same notation Πx : A.B for this function space.

So can we give an embedding of C+ in Twelf, without introducing a lot of extra scaffolding?
Yes we can, using an old trick due to Reynolds [1972]. Defunctionalization is a technique for
taking a higher-order functional program and turning it into a program with only first-order
functions. The rough idea is to (1) give each function in a program a unique tag (i.e., some
element of a first-order data type), (2) pass tags around instead of actual functions, and
(3) define a separate "apply" function, which describes how to apply tags (denoting functions)
to arguments. A slight complication is that function bodies may reference escaping variables,
so defunctionalization has to be preceded by closure conversion. In the Twelf code below, I
won't assume prior background with defunctionalization, but the reader may consult Danvy
and Nielsen [2001] for a good introduction. I will assume some familiarity with Twelf. Again,
the Twelf Wiki is a good source of information, as is the article by Harper and Licata [2007].
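To make the three steps concrete outside of Twelf, here is a minimal sketch in Python. It is not part of the embedding, and the names Add, Scale, apply_tag, and run_pipeline are hypothetical, chosen only for illustration. The higher-order version passes closures around; the defunctionalized version passes first-order tags, whose fields hold the variables the closures would have captured (this is the closure-conversion step), and a single apply function interprets them.

```python
from dataclasses import dataclass


# Higher-order original: functions are passed around as values.
def make_adder(n):
    return lambda x: x + n


def make_scaler(c):
    return lambda x: x * c


def run_pipeline(functions, x):
    for f in functions:
        x = f(x)
    return x


# Step 1: one tag per function, as elements of a first-order data type.
# The fields record the variables each closure captured.
@dataclass(frozen=True)
class Add:
    n: int


@dataclass(frozen=True)
class Scale:
    c: int


# Step 3: a separate "apply" function says how to apply a tag to an argument.
def apply_tag(tag, x):
    if isinstance(tag, Add):
        return x + tag.n
    if isinstance(tag, Scale):
        return x * tag.c
    raise TypeError(f"unknown tag: {tag!r}")


# Step 2: the program passes tags around instead of functions.
def run_pipeline_defun(tags, x):
    for tag in tags:
        x = apply_tag(tag, x)
    return x
```

The two versions are meant to agree on corresponding inputs: run_pipeline([make_adder(3), make_scaler(2)], 4) and run_pipeline_defun([Add(3), Scale(2)], 4) compute the same number.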

As in the Agda embedding, we begin by describing the syntax of types. However, in Twelf, it
is much easier to define the syntax of general recursive types, because we don't need to invent a
notion of type substitution: we inherit it from LF. So, we give a somewhat more general signature
than we gave in the Agda embedding:


pos : type.
int : pos.                    % primitive integers
+ : pos -> pos -> pos.        % binary sums
void : pos.                   % void
* : pos -> pos -> pos.        % binary products
unit : pos.                   % unit
¬ : pos -> pos.               % continuations
rec : (pos -> pos) -> pos.    % recursive types

%infix right 13 +.
%infix right 14 *.

Now  we  can  define  various  useful  derived  types  and  type  constructors: 


bool : pos                    % booleans
  = unit + unit.
nat : pos                     % unary nats
  = rec [X] unit + X.
list : pos -> pos             % cons lists
  = [A] rec [X] unit + A * X.
→ : pos -> pos -> pos         % CBV-CPS functions
  = [A] [B] ¬ (A * ¬ B).
%infix right 12 →.
D : pos                       % domain D = D -> D
  = rec [X] X → X.


4 Ironically, the very different technique I have described of representing object-level
pattern-matching by functions in the meta-language could also reasonably be called "higher-order
abstract syntax". This was done by Zeilberger [2008], probably a bit confusingly. Both forms of
encoding are higher-order, but one uses the powerful function space Πx : A.B to encode
pattern-matching, and the other uses the weak function space (x : A).B to encode variable
binding/substitution. I am not sure of the best terminology to unite these concepts.

Observe that we are defining the recursive type nat in addition to the type int named above.
The C+ type int will stand for "machine integers", so to speak, represented by an LF type i
with some basic arithmetic operations (here addition suffices):

i : type.
z : i.
s : i -> i. %prefix 10 s.

add : i -> i -> i -> type.
%mode add +M +N -P.
add/z : add z N N.
add/s : add (s M) N (s P) <- add M N P.
%worlds () (add M _ _).
%total (M) (add M _ _).
%unique add +M +N -P.

add  is  just  the  usual  logic  programming  definition  of  addition,  which  Twelf  verifies  is  a  functional 
relation  (i.e.,  for  every  M  and  N,  there  is  a  unique  P  such  that  add  M  N  P).  In  the  code  below,  we 
will  appeal  to  add  when  defining  C+  continuations  over  ints.  In  contrast,  we  won't  assume  any 
operations  for  the  C+  type  nat,  which  stands  for  the  usual  recursive  datatype  definition. 
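As a rough illustration of what it means for add to be a functional relation, the following Python sketch reads the two clauses in the mode (+M, +N, -P) and enumerates every output they derive. It is only an analogy: nonnegative Python integers stand in for the numerals built from z and s, the generator name add is borrowed from the Twelf code, and checking small inputs by brute force is of course not the proof that Twelf's %total and %unique checks provide.

```python
def add(m, n):
    """Yield every p such that "add m n p" is derivable from the two clauses."""
    if m == 0:
        # add/z : add z N N.
        yield n
    else:
        # add/s : add (s M) N (s P) <- add M N P.
        for p in add(m - 1, n):
            yield p + 1


def is_functional_on(inputs):
    """Check that each input pair has exactly one output (total and unique)."""
    return all(len(list(add(m, n))) == 1 for m, n in inputs)
```

For instance, list(add(2, 3)) contains the single solution for "2 + 3", and is_functional_on can be run over any finite set of input pairs.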

Now  that  we  have  some  types,  we  can  define  frames,  just  as  we  did  in  Agda: 

frame : type.
· : frame.
, : frame -> frame -> frame.
• : pos -> frame.
%infix right 11 ,.

And  now  that  we  have  frames,  we  can  define  patterns: 

⊩ : frame -> pos -> type.
%infix none 9 ⊩.

n : i -> · ⊩ int.
inl : DΔ ⊩ A -> DΔ ⊩ A + B.
inr : DΔ ⊩ B -> DΔ ⊩ A + B.
u : · ⊩ unit.
pair : DΔ1 ⊩ A -> DΔ2 ⊩ B -> DΔ1 , DΔ2 ⊩ A * B.
fold : DΔ ⊩ A (rec A) -> DΔ ⊩ rec A.
hole : • A ⊩ ¬ A.

We  can  also  define  some  abbreviations  for  patterns  for  derived  types: 


tt : · ⊩ bool = inl u.
ff : · ⊩ bool = inr u.
zz : · ⊩ nat = fold (inl u).
ss : DΔ ⊩ nat -> DΔ ⊩ nat = [p] fold (inr p).
%prefix 9 ss.
nil : · ⊩ list A = fold (inl u).
cons : DΔ1 ⊩ A -> DΔ2 ⊩ list A -> DΔ1 , DΔ2 ⊩ list A
  = [p1] [p2] fold (inr (pair p1 p2)).

Now  comes  the  interesting  part — the  representation  of  C+  terms — where  the  Twelf  embedding 
will  really  diverge  from  the  Agda  embedding.  Instead  of  representing  variables  as  de  Bruijn-ish 
indices  into  an  explicit  context,  we  will  represent  them  as  actual  variables  in  the  implicit,  LF 
context. So, we declare a type family tm J, representing derivations of J, without mentioning
any  explicit  context: 

j : type.
true : pos -> j. %prefix 10 true.
false : pos -> j. %prefix 10 false.
all : frame -> j. %prefix 10 all.
# : j.

tm : j -> type.

However, because we want the ability to bind a list of multiple variables at a time (during
pattern-matching), which is not directly representable by LF types, we define an auxiliary
judgment Δ ⊢ J, which simply transfers variables into the implicit context one-by-one (this is
the so-called "explicit contexts" technique of Crary [2009]):

⊢ : frame -> j -> j.
%infix right 9 ⊢.

Λ_ : tm J -> tm (DΔ ⊢ J).
Λ, : tm (DΔ1 ⊢ DΔ2 ⊢ J) -> tm (DΔ1 , DΔ2 ⊢ J).
Λcon : (tm (false A) -> tm J) -> tm (• A ⊢ J).
Λsub : (tm (all DΔ) -> tm J) -> tm (DΔ ⊢ J).
%prefix 9 Λ_. %prefix 9 Λ,. %prefix 9 Λcon. %prefix 9 Λsub.

The constructors Λ_ and Λ, just express weakening and associativity of explicit contexts. Λcon
and Λsub do the interesting work: the former binds a continuation variable in the implicit
context, while the latter binds an entire frame as a substitution variable. Note that this is a
slightly simplified view from Chapter 4: we are not distinguishing continuations and continuation
variables as syntactic classes, instead viewing continuation variables literally as LF variables
for continuations. This means that there will be more terms in the embedding than only canonical
C+ terms: basically, we are directly stipulating the identity principles, rather than deriving
them. We do not have to do things this way, but it lets us take greater advantage of the logical
framework. In particular, substitution for variables is free.

Now  we  begin  to  define  terms  in  the  implicit  context.  As  usual,  a  value  is  a  pattern  and  a 
substitution: 

val : DΔ ⊩ A -> tm (all DΔ) -> tm (true A).

But  observe  the  rule  does  not  mention  any  explicit  context.  Let  us  skip  over  continuations  for 
now — assuming  we  know  how  to  build  them,  we  can  also  build  substitutions: 

shole : tm (false A) -> tm (all • A).
snil : tm (all ·).
sjoin : tm (all DΔ1) -> tm (all DΔ2) -> tm (all DΔ1 , DΔ2).
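As an informal illustration of "a value is a pattern and a substitution", the following Python sketch computes the frame of a pattern and refuses to build a value unless the substitution supplies exactly one continuation per hole. It is a toy under stated assumptions, not the Twelf encoding: patterns are nested tuples, frames are flattened to lists of hole markers (so the tree structure that sjoin records is lost), types are not tracked, and the names frame_of and make_value are hypothetical.

```python
def frame_of(pattern):
    """Return the frame of a pattern as a flat list with one entry per hole."""
    tag = pattern[0]
    if tag in ("n", "u"):
        # n N and u bind no continuation variables.
        return []
    if tag == "hole":
        # hole binds exactly one continuation variable.
        return ["hole"]
    if tag in ("inl", "inr", "fold"):
        # Same frame as the sub-pattern.
        return frame_of(pattern[1])
    if tag == "pair":
        # Frames of the two components, joined left to right.
        return frame_of(pattern[1]) + frame_of(pattern[2])
    raise ValueError(f"not a pattern: {pattern!r}")


def make_value(pattern, substitution):
    """Pair a pattern with a substitution that fills each of its holes."""
    substitution = list(substitution)
    if len(substitution) != len(frame_of(pattern)):
        raise ValueError("substitution does not match the pattern's frame")
    return ("val", pattern, substitution)
```

For example, the pattern ("pair", ("n", 2), ("hole",)) has a one-element frame, so make_value accepts it together with a one-element substitution and rejects it with an empty one.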

Finally,  we  define  different  kinds  of  expressions: 

throw : tm (false A) -> tm (true A) -> tm #.
let : tm (all DΔ) -> tm (DΔ ⊢ #) -> tm #.
℧ : i -> tm #.

Again, here there is a subtle difference with the definition in Chapter 4: we are directly encoding
the composition principles, rather than only allowing expressions of the form k V. (We are also
indexing ℧ by an integer, but that is merely to have some more interesting observable results in
the examples below.)

At last, the moment of truth (or rather, the moment of falsehood): how do we define
continuations? As we said, the trick boils down to defunctionalization. What that means here is
that  instead  of  giving  a  generic  constructor  for  inhabiting  the  LF  type  tm  (false  A)  by  maps 
from  patterns  to  expressions  (analogous  to  the  con  constructor  in  the  Agda  embedding),  we  will 
instead  treat  constants  of  type  tm  (false  A)  as  tags,  and  define  an  "apply  function"  that  does 
the  actual  work  of  associating — for  each  tag — patterns  with  expressions.  We  call  this  function 
body,  since  it  defines  the  body  of  the  continuation  denoted  by  each  tag.  Because  the  operational 
semantics  of  Twelf  are  based  on  logic  programming,  we  actually  declare  body  as  a  relation:5 

body : tm (false A) -> DΔ ⊩ A -> tm (DΔ ⊢ #) -> type.
%mode body +K +P -E.

Of  course,  since  body  is  only  declared  as  a  relation,  it  isn't  forced  to  behave  like  a  function — but 
with  the  appropriate  keywords,  we  can  have  Twelf  verify  that  it  is  in  fact  a  total  functional 
relation: 

%worlds () (body K P _).
%total P (body K P _).
%unique body +K +P -E.

The check is actually trivial at this point, because we haven't inhabited the LF type
tm (false A) with any continuation tags. Twelf allows us to spread out the definition of body
throughout the file, which is convenient because we can place the body clauses relevant to a
particular tag immediately after its declaration. Then, adding %total and %unique declarations
will induce Twelf to check that our pattern-matching definitions really are exhaustive and
non-redundant.
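To convey what "exhaustive and non-redundant" asks of the body clauses, here is a loose Python sketch that compares a list of clauses against every combination of tag and pattern. It rests on strong simplifying assumptions: the argument type has a finite list of closed patterns (here the two bool patterns), clauses are plain triples, and the tags branch and always0 as well as the function name coverage_report are hypothetical. Twelf's actual checker works by case-splitting on LF types rather than by enumeration.

```python
from collections import Counter


def coverage_report(clauses, tags, patterns):
    """Compare body clauses against every (tag, pattern) combination.

    clauses  -- list of (tag, pattern, expression) triples, one per clause
    tags     -- the continuation tags that have been declared
    patterns -- the finite list of closed patterns of the argument type
    Returns (missing, redundant): pairs with no clause, pairs with several.
    """
    counts = Counter((tag, pattern) for tag, pattern, _ in clauses)
    missing = [(t, p) for t in tags for p in patterns if counts[(t, p)] == 0]
    redundant = sorted(pair for pair, c in counts.items() if c > 1)
    return missing, redundant


# Hypothetical bool-continuations, defined by clauses over the patterns tt and ff.
BOOL_PATTERNS = ["tt", "ff"]
TAGS = ["branch", "always0"]
CLAUSES = [
    ("branch", "tt", "abort 1"),
    ("branch", "ff", "abort 0"),
    ("always0", "tt", "abort 0"),
    ("always0", "ff", "abort 0"),
]
```

With all four clauses present the report is empty on both sides; dropping a clause shows up in the first component (the analogue of a failed %total check), and repeating one shows up in the second (the analogue of a failed %unique check).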

Before  we  define  any  particular  continuations,  though,  we  can  already  define  the  generic 
operational  semantics  by  appealing  to  body.  First,  we  define  results: 

result : type.
halt : i -> result.

Twelf does not (yet) support coinductively defined relations, so for simplicity we assume that the
only possible result is termination with some integer. The operational semantics itself is a slight
variant of the environment semantics: rather than maintaining an explicit environment, we use
LF substitution. First, we define the (total functional) relation load, which takes a substitution
σ for Δ, a term t in a context extended by Δ, and returns the term t[σ]:

load : tm (all DΔ) -> tm (DΔ ⊢ J) -> tm J -> type.
%mode load +Sσ +T -T'.
ld/tm : load Sσ (Λ_ T) T.
ld/join : load (sjoin Sσ1 Sσ2) (Λ, T) T''
  <- load Sσ1 T T'
  <- load Sσ2 T' T''.
ld/con : load (shole K) (Λcon T*) (T* K).
ld/sub : load Sσ (Λsub T) (T Sσ).
%worlds () (load _ _ _).
%total (Sσ) (load Sσ _ _).
%unique load +Sσ +T -T'.

5 The Twelf embedding could be easily adapted to functional languages based on LF, such as
Delphin [Poswolsky and Schürmann, 2008] or Beluga [Pientka and Dunfield, 2008]. In that case,
body would actually be a (computational) function.

And  then  we  define  eval,  which  evaluates  an  expression  to  a  result: 

eval : tm # -> result -> type.
%mode eval +E -R.
ev/load : eval (let Sσ E) R
  <- load Sσ E E'
  <- eval E' R.
ev/throw : eval (throw K (val P Sσ)) R
  <- body K P E
  <- eval (let Sσ E) R.
ev/℧ : eval (℧ N) (halt N).
%worlds () (eval _ _).
%covers eval +E -R.

The  rules  ev/load  and  ev/throw  are  analogous  to  the  C+  environment  semantics  rules  lookup+ 
and  bind/call+,  except  that  they  don't  use  an  explicit  environment.  Case  ev/throw  is  where 
defunctionalization  comes  in:  instead  of  directly  computing  K(p)  =  E  by  evaluating  K  on  p, 
we  appeal  to  body.  The  eval  relation  is  not  in  general  going  to  be  total,  because  we  can  write 
non-terminating  programs.  However,  we  can  ask  that  eval  covers  all  the  different  kinds  of 
expressions.6 

With  that  we  are  done  encoding  the  generic  type  system  and  operational  semantics  of  C+  in 
Twelf,  and  can  now  consider  some  examples.  To  begin,  we  define  a  pair  of  simple  but  useful 
defunctionalized  continuations: 

ignore : tm (false A).
ignore/p : body ignore P (Λ_ ℧ z).

exit : tm (false int).
exit/n : body exit (n N) (Λ_ ℧ N).

ignore is the A-continuation that ignores its argument and just aborts with zero, while exit
is the int-continuation that reads an integer and aborts with that integer. Observe the overall
form of the definition of continuations in defunctionalized style: we declare a constant
K : tm (false A), and then give clauses inhabiting body K P1 E1, body K P2 E2, etc. Here,
we only used a single body clause for each continuation, and we can verify that this is enough:

%total P (body K P _).
%unique body +K +P -E.

Nothing forces us to only define continuation constants, and it is quite useful to define
constructors that build defunctionalized continuations out of other objects. For example, given
two expressions Ez and Enz, we can build an int-continuation iszero Ez Enz that executes Ez if
its argument is zero, and Enz if its argument is non-zero:

iszero : tm # -> tm # -> tm (false int).
iszero/z : body (iszero Ez Enz) (n z) (Λ_ Ez).
iszero/nz : body (iszero Ez Enz) (n (s _)) (Λ_ Enz).

Observe  that  now  we  are  using  two  body  clauses  to  define  the  continuation  by  pattern-matching. 
We  can  also  appeal  to  side-conditions  when  giving  the  body  clauses  themselves.  For  example, 
we  can  define  plus,  the  continuation  transformer  implementing  addition  of  ints,  by  appealing 
directly  to  "native  arithmetic",  i.e.,  the  relation  add  defined  above: 

plus : tm (false int) -> tm (false (int * int)).
plus/mn : body (plus K) (pair (n M) (n N)) (Λ_ throw K (val (n P) snil))
  <- add M N P.

Compare this with plus', the corresponding continuation transformer for nats, defined in terms
of an auxiliary continuation transformer succ:

succ : tm (false nat) -> tm (false nat).
succ/n : body (succ K) N (Λsub [σ] throw K (val (ss N) σ)).

plus' : tm (false nat) -> tm (false (nat * nat)).
plus'/zn : body (plus' K) (pair zz N) (Λ, Λ_ Λsub [σ]
  throw K (val N σ)).
plus'/sn : body (plus' K) (pair (ss M) N) (Λsub [σ]
  throw (plus' (succ K)) (val (pair M N) σ)).

Except for some uninteresting contortions to bind substitution variables and carry them through,
the body of plus' corresponds to the standard recursive definition of addition, in
continuation-passing style. Finally, we define a conversion function from nat to int, using an
auxiliary continuation transformer add1:

add1 : tm (false int) -> tm (false int).
add1/n : body (add1 K) (n N) (Λ_ throw K (val (n (s N)) snil)).

n2i : tm (false int) -> tm (false nat).
n2i/zz : body (n2i K) zz (Λ_ throw K (val (n z) snil)).
n2i/ss : body (n2i K) (ss N) (Λsub [σ] throw (n2i (add1 K)) (val N σ)).

At this point, it is worth checking again that all of our definitions are exhaustive and
non-redundant:

%total P (body K P _).
%unique body +K +P -E.

Thankfully, this passes. If we had forgotten a case (e.g., if n2i/zz were commented out), the
Twelf coverage checker would have reported an error.

Now, let's try running some programs. First, we try evaluating "2 + 2" using machine integers
(note that plus exit denotes the int * int-continuation that adds its two arguments and then
terminates with the result):

%query 1 *
  eval (throw (plus exit)
         (val (pair (n (s s z)) (n (s s z))) (sjoin snil snil)))
       R.

And indeed, the Twelf server answers R = halt (s s s s z). Then we try evaluating "2 + 3"
using the nat datatype (precomposing exit with the coercion n2i, to convert the result of plus'
into an int):

%query 1 *
  eval (throw
         (plus' (n2i exit)) (val (pair (ss ss zz) (ss ss ss zz)) (sjoin snil snil)))
       R.

And  indeed,  R  =  halt  (s  s  s  s  s  z). 
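The two queries can be replayed in a small Python sketch of the same defunctionalized machine. This is an approximation under stated assumptions, not a translation of the Twelf signature: continuation tags are tuples, machine integers are Python ints, nats are nested tuples built by the hypothetical helper nat, substitutions are omitted because none of these patterns contains a hole, and plus' is renamed plus_nat. The function body plays the role of the "apply function", returning the next machine state, and run iterates it until the machine halts.

```python
ZZ = ("zz",)


def ss(n):
    return ("ss", n)


def nat(k):
    """Build the unary nat for a nonnegative Python int (helper for examples)."""
    return ZZ if k == 0 else ss(nat(k - 1))


# Continuation tags are first-order data; constructors may take other tags.
EXIT = ("exit",)


def plus(k):
    return ("plus", k)


def succ(k):
    return ("succ", k)


def plus_nat(k):
    return ("plus'", k)


def add1(k):
    return ("add1", k)


def n2i(k):
    return ("n2i", k)


def body(k, v):
    """Return the next state: ("throw", k2, v2) or ("halt", n)."""
    tag = k[0]
    if tag == "exit":                      # exit/n
        return ("halt", v)
    if tag == "plus":                      # plus/mn, using native addition
        m, n = v
        return ("throw", k[1], m + n)
    if tag == "succ":                      # succ/n
        return ("throw", k[1], ss(v))
    if tag == "plus'":
        m, n = v
        if m == ZZ:                        # plus'/zn
            return ("throw", k[1], n)
        return ("throw", plus_nat(succ(k[1])), (m[1], n))   # plus'/sn
    if tag == "add1":                      # add1/n
        return ("throw", k[1], v + 1)
    if tag == "n2i":
        if v == ZZ:                        # n2i/zz
            return ("throw", k[1], 0)
        return ("throw", n2i(add1(k[1])), v[1])             # n2i/ss
    raise ValueError(f"unknown continuation tag: {k!r}")


def run(k, v):
    """Throw v to k and keep stepping until the machine halts."""
    state = ("throw", k, v)
    while state[0] == "throw":
        _, k, v = state
        state = body(k, v)
    return state[1]
```

Here run(plus(EXIT), (2, 2)) corresponds to the first query and run(plus_nat(n2i(EXIT)), (nat(2), nat(3))) to the second.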

To end this chapter, we will work up to a more significant example, along the lines of §4.4,
building macros that let us view direct-style programs in call-by-value lambda-calculus (with
strict products and sums) as syntactic sugar for C+ programs. (Or if you prefer, we are building
a continuation semantics of CBV lambda-calculus via C+.) To begin, we define combinators
for defunctionalized continuations, corresponding to the standard sequent-calculus left rules for
products, sums, and negation:

fst : tm (false A) -> tm (false (A * B)).
fst/xy : body (fst K) (pair P1 P2) (Λ, Λsub [σ1] Λsub [σ2] let σ1 E)
  <- body K P1 E.

snd : tm (false B) -> tm (false (A * B)).
snd/xy : body (snd K) (pair P1 P2) (Λ, Λsub [σ1] Λsub [σ2] let σ2 E)
  <- body K P2 E.

case : tm (false A) -> tm (false B) -> tm (false (A + B)).
case/inl : body (case K1 K2) (inl P) E1
  <- body K1 P E1.
case/inr : body (case K1 K2) (inr P) E2
  <- body K2 P E2.

not : (tm (false A) -> tm #) -> tm (false ¬ A).
not/k : body (not K) hole (Λcon [k] K k).

We  also  define  some  combinators  that  allow  us  to  directly  convert  LF  functions  into  continuations: 

con : ({Δ} Δ ⊩ A -> tm (all Δ) -> tm #) -> tm (false A).
con/x : body (con K) P (Λsub [σ] K _ P σ).

conV : (tm (true A) -> tm #) -> tm (false A).
conV/x : body (conV K) P (Λsub [σ] K (val P σ)).

uncurry : (tm (false B) -> tm (false A)) -> tm (false (A * ¬ B)).
uncurry/pk : body (uncurry K) (pair P hole) (Λ, Λsub [σ1] Λcon [k] throw (K k) (val P σ1)).

Note we will use con and conV to define continuations precisely when we don't need to do any
pattern-matching on the argument. As usual, we can verify that these are proper definitions:

%total P (body K P _).
%unique body +K +P -E.

Now, as in §4.4, terms of CBV lambda-calculus will be represented by expressions with a free
continuation variable. We define an LF type abbreviation cmp A, representing lambda terms
computing type A:

%abbrev cmp : pos -> type = [A] tm (false A) -> tm #.

Then we define the canonical injection from values to computations:

%abbrev Lift : tm (true A) -> cmp A
  = [x] [k] throw k x.

The left-to-right pairing operation:

%abbrev
Pair : cmp A -> cmp B -> cmp (A * B)
  = [e1] [e2] [k] e1 (con [_] [p1] [σ1] e2 (con [_] [p2] [σ2]
      throw k (val (pair p1 p2) (sjoin σ1 σ2)))).

The first and second projections:

%abbrev
Fst : cmp (A * B) -> cmp A
  = [e] [k] e (fst k).

%abbrev
Snd : cmp (A * B) -> cmp B
  = [e] [k] e (snd k).

The left and right injections:

%abbrev
Inl : cmp A -> cmp (A + B)
  = [e] [k] e (con [_] [p] [σ1] throw k (val (inl p) σ1)).

%abbrev
Inr : cmp B -> cmp (A + B)
  = [e] [k] e (con [_] [p] [σ2] throw k (val (inr p) σ2)).

(Strict) case-analysis:

%abbrev
Case : cmp (A + B) -> (tm (true A) -> cmp C) -> (tm (true B) -> cmp C) -> cmp C
  = [e] [f] [g] [k] e (case (conV [x] (f x) k) (conV [y] (g y) k)).

(Call-by-value) abstraction and application:

%abbrev
Fn : (tm (true A) -> cmp B) -> cmp (A → B)
  = [f] Lift (val hole (shole (uncurry [k'] conV [x] (f x) k'))).

%abbrev
App : cmp (A → B) -> cmp A -> cmp B
  = [f] [e] [k]
      f (not [kf]
        e (con [_] [p] [σ1] throw kf (val (pair p hole) (sjoin σ1 (shole k))))).

The natural numbers, represented internally by int:

%abbrev
Z : cmp int
  = Lift (val (n z) snil).

%abbrev
S : cmp int -> cmp int
  = [e] [k] e (add1 k).

%prefix 9 S.

Addition, with left-to-right evaluation order:

%abbrev
Plus : cmp int -> cmp int -> cmp int
  = [e1] [e2] [k] e1 (conV [n1] e2 (conV [n2]
      (Pair (Lift n1) (Lift n2)) (plus k))).

(Note we could also give a slightly more efficient definition, with fewer administrative redexes:)

%abbrev
Plus* : cmp int -> cmp int -> cmp int
  = [e1] [e2] [k] e1 (con [_] [n1] [σ1] e2 (con [_] [n2] [σ2]
      throw (plus k) (val (pair n1 n2) (sjoin σ1 σ2)))).

And finally, two effectful computations, which simply abort (with different results):

%abbrev
Abort0 : cmp A
  = [k] ℧ z.

%abbrev
Abort1 : cmp A
  = [k] ℧ (s z).

For convenience, we define a type abbreviation run t n, representing the fact that an
int-computation terminates with result n:

%abbrev
run : cmp int -> i -> type = [t] [n] eval (t exit) (halt n).

We can now run direct-style programs directly using the operational semantics of C+. For
example, Twelf answers the following query with N = s s s s s z:

%query 1 *
  run (Plus (S S Z) (S S S Z)) N.

It answers the following with N = s z:

%query 1 *
  run (Plus (S S Z) Abort1) N.

And the following with N = z (demonstrating left-to-right evaluation):

%query 1 *
  run (Plus Abort0 Abort1) N.

For the big finale, we evaluate the term representing "(λx.x + 1) 1":

%query 1 *
  run (App (Fn [x] Plus (Lift x) (S Z)) (S Z)) N.

And  Twelf  answers  N  =  s  s  z. 
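The shape of these macros can be imitated in Python, which may help in reading the queries above. The sketch below is only an analogy and makes several assumptions: continuations are ordinary Python closures rather than defunctionalized tags, a computation is a function from a continuation to a final integer, aborting is modelled by returning an integer without calling the continuation, and a function value is a Python function taking an (argument, return continuation) pair. The lowercase names (lift, plus, fn, app, and so on) are hypothetical counterparts of the abbreviations Lift, Plus, Fn, and App.

```python
def lift(x):
    """Inject a value into a computation: pass it to the continuation."""
    return lambda k: k(x)


def pair_(e1, e2):
    """Left-to-right pairing: run e1, then e2, then return the pair."""
    return lambda k: e1(lambda v1: e2(lambda v2: k((v1, v2))))


Z = lift(0)


def S(e):
    return lambda k: e(lambda n: k(n + 1))


def plus(e1, e2):
    """Addition with left-to-right evaluation order."""
    return lambda k: e1(lambda n1: e2(lambda n2: k(n1 + n2)))


# Effectful computations: ignore the continuation and abort with a result.
abort0 = lambda k: 0
abort1 = lambda k: 1


def fn(f):
    """Call-by-value abstraction; f maps a value to a computation."""
    return lift(lambda arg_and_k: f(arg_and_k[0])(arg_and_k[1]))


def app(f, e):
    """Application: evaluate the function, then the argument, then call."""
    return lambda k: f(lambda fv: e(lambda v: fv((v, k))))


def run(t):
    """Run an int-computation with the exit continuation."""
    return t(lambda n: n)
```

For instance, run(plus(abort0, abort1)) never reaches abort1, which is the Python counterpart of the left-to-right behaviour observed in the third query, and run(app(fn(lambda x: plus(lift(x), S(Z))), S(Z))) corresponds to the final query.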
5.3 Related work


The idea of building finitary notation systems for infinite proofs goes back to the days of Hilbert's
project and the ω-rule [Hilbert, 1931], particularly as developed by Schütte [1950] and Shoenfield
[1959]. Infinite terms were also considered early in type theory by Tait [1965] (with a term
constructor analogous to the ω-rule) and further explored by Martin-Löf [1972]. Since then,
more systematic connections between infinitary and finitary proof/type theory have drawn on
the work of Mints [1978], who explained how to view cut-elimination on infinite objects as a
continuous operation that could be reflected back to finite objects [Mints, 2000, Buchholz, 1991,
1997, Schwichtenberg, 1998]. Note though that these approaches mainly take advantage of the
idea of proofs/terms as infinitely wide objects (what I would call derivations for types with
infinitely many patterns). The idea of proofs/terms as infinitely deep objects (what I would call
derivations for types with a non-well-founded definition ordering) has also been taken seriously,
drawing on the work of Aczel [1988] on non-well-founded sets. Coquand [1994], for example,
has considered infinite objects in type theory defined through guarded recursive clauses. More
recently, Brotherston and Simpson [2007] have introduced a sequent calculus of cyclic proofs, and
explored its practical application in inductive theorem proving.
Chapter 6

Refinement  types  and  completeness  of 
subtyping 


But  why  accept  the  counterexample?  We  proved  our  conjecture — now  it  is  a  theorem. 
I  admit  that  it  clashes  with  this  so-called  "counterexample".  One  of  them  has  to  give 
way. But why should the theorem give way, when it has been proved? It is the
"criticism" that should retreat. It is fake criticism. ... It is a monster, a pathological
case, not a counterexample.

-- Delta, speaking in Imre Lakatos' Proofs and Refutations


6.1  Introduction 

In Chapter 4 we built a language, C, by placing our faith stubbornly in the proofs-as-programs
correspondence: programs of C are literally derivations in polarized logic, and their semantics
are literally defined by composition of derivations, a.k.a. cut-elimination. The real work
in making this dogma practical was developing a type-free notation for C amenable to functional
programmers. The notation we gave was hopefully largely familiar to ML or Haskell
programmers (who program happily with pattern-matching and arbitrary mutual recursion),
except for the novel twist of continuation patterns.

What  does  it  mean  that  the  notation  is  type-free?  Because  C  programs  are  intrinsically  typed, 
the  type  of  a  program  is  a  promise  about  how  the  program  will  be  structured.  For  example,  the 
way we define a (negative) value of type N → ↑N is by presenting a map from N-patterns to
↑N-values. The type tells us we need not clutter our syntax by, say, considering B-patterns. On the
other  hand,  this  is  not  telling  us  very  much.  There  are  potentially  many  other  things  we  would 
like  to  know  about  a  function  on  the  natural  numbers.  For  example,  whether  it  is  bounded,  or 
linear,  or  monotonic.  At  the  very  least,  we  might  like  to  know  that  it  always  terminates,  or  that 
it  never  aborts. 

This is the motivation behind refinement types. Originally introduced by Freeman and Pfenning
[1991] in the context of ML, the idea was to develop a second, more refined type system
on top of standard Hindley-Milner inference, capable of expressing and automatically verifying
more precise invariants of programs, but without changing the language itself. A succession of
Pfenning's students pursued this line of research in their dissertations, including Tim Freeman
[1994], Hongwei Xi [1998], Rowan Davies [2005], and Joshua Dunfield [2007]. Along the way,
an observation was made that became the original motivation for this thesis: as refinement type
systems become more and more precise, they become sensitive to the evaluation order of the
original language.

This  was  in  some  ways  surprising,  and  in  some  ways  unsurprising.  Surprising  because 
of  examples  like  Haskell  and  ML,  which  use  different  evaluation  order  but  share,  at  least  to 
a  first  approximation,  the  same  underlying  type  system.  On  the  other  hand,  not  completely 
surprising  because  of  artifacts  like  the  value  restriction,  specific  to  ML,  which  emerged  after 
painful  years  of  trying  to  understand  the  interaction  of  polymorphism  with  effects.  In  early 
compilers,  programs  like  the  following  clearly  senseless  one  passed  the  typechecker,  with  the 
variable x given polymorphic type ('a list) ref:

let val x = ref []
in (x := ["hello world!\n"]; hd (!x) + 2)
end

This was thought to be an issue peculiar to mutable references, and various patches to
typechecking were proposed. A non-exhaustive list of people who worked specifically on the problem
of combining polymorphism with references includes MacQueen [1988], Tofte [1988], Leroy and
Weis [1991], Wright [1992], Hoang, Mitchell, and Viswanathan [1993], Greiner [1993], and Talpin
and Jouvelot [1994]. However, the problem with polymorphism was more pervasive than originally
realized (not confined to its interaction with references but with effects more generally), as
Harper and Lillibridge [1991] exhibited by giving another unsound but (at the time) well-typed
program, using SML/NJ's callcc feature.1

The value restriction, proposed by Wright [1995] and eventually adopted in the definition
of Standard ML [Milner et al., 1997], ruled out all of these programs by limiting the
situations in which polymorphic generalization could be applied: for x to be generalized in
let val x = e1 in e2 end, e1 must be a polymorphic value. This rules out the senseless program
above, for example, because ref [] is not a value (at run-time it evaluates to a fresh
location storing the empty list). It is important to see, though, that the reason this restriction
on typing is necessary is not only the presence of effects: it is also the eager semantics of the
let construct. If instead e1 were substituted wholesale for x, without first evaluating it down
to a value, then no value restriction would be necessary. For instance, under such semantics,
the expression hd (!x) + 2 in the above program would attempt to take the head of an empty
list, which would be strange and raise an exception, but would not be unsound.

Given this background, it should then not have been too surprising when Davies and Pfenning
[2000] discovered that the standard typing rules for intersection types (invented in the late
'70s by Coppo and Dezani-Ciancaglini [1978] and independently by Sallé [1978]) were unsound
for ML. After all, intersection types are a form of finitary polymorphism. Thus Davies and
Pfenning found that a value restriction on intersection introduction was necessary. Somewhat
unexpectedly, they also found that a standard rule of subtyping,

\[ (A \to B) \cap (A \to C) \le A \to (B \cap C) \]

was unsound for call-by-value functions in the presence of effects. The reason for this turns
out to be precisely the same as for the value restriction, but might be more psychologically
disturbing, since it seems to be a statement about what intersection types and function types
mean, independently of the syntax of the language.

In an even more bizarre turn, Dunfield and Pfenning [2004] discovered an entirely new evaluation
context restriction when investigating the theory of (untagged) union types in ML. Earlier,
in an effect-free setting, Barbanera et al. [1995] had studied union types with this elimination
rule:

\[
\frac{\Gamma \vdash e : A \cup B \qquad \Gamma, x : A \vdash e' : C \qquad \Gamma, x : B \vdash e' : C}
     {\Gamma \vdash e'[e/x] : C}
\]

The  rule  allows  eliminating  arbitrarily  many  occurrences  of  an  expression  e  of  union  type,  by 
discriminating  the  union.  But  Dunfield  and  Pfenning  [2004]  found  this  was  unsound  in  the 
presence  of  effects,  because  the  different  occurrences  of  e  could  evaluate  to  a  value  of  type  A  or 
B  nondeterministically.  Therefore  they  proposed  the  unusual  "tridirectional"  rule,  schematized 
by  evaluation  contexts  E[]: 

\[
\frac{\Gamma \vdash e : A \cup B \qquad \Gamma, x : A \vdash E[x] : C \qquad \Gamma, x : B \vdash E[x] : C}
     {\Gamma \vdash E[e] : C}
\]

This rule only allows eliminating a single occurrence of e, in evaluation position. Although
Dunfield and Pfenning did not mention it explicitly, the reasons for the evaluation context
restriction also imply that the standard laws $(A \to C) \cap (B \to C) \le (A \cup B) \to C$ and
$\top \le \bot \to C$ are unsound for call-by-name functions in the presence of effects (the latter
even when the only effect is non-termination).

All  of  these  restrictions  were  discovered  by  "disillusionment",  so  to  speak,  in  the  sense  that 
the  messy  world  of  side-effects  provided  counterexamples  to  simple  but  naive  rules.  Yet,  bitter 
experience  is  only  a  poor  substitute  for  explanation,  and  understandably  we  may  get  the  feeling 
that policies such as the value and evaluation context restrictions amount to ad hoc
monster-barring. The central aim of this chapter is to explain why phenomena such as the value and
evaluation  context  restrictions,  as  well  as  evaluation-order-dependent  subtyping  laws,  need  not 
be  policies  imposed  after-the-fact,  but  instead  can  arise  synthetically  from  a  logical  view  of 
refinement  typing.  Our  goal  is  not  only  to  better  understand  existing  choices,  but  to  develop  a 
theoretical  framework  that  narrows  the  design  space  for  future,  more  expressive  type  systems 
for  effectful  programming  languages. 

We will pay special attention to the question of subtyping, particularly through a view of
subtyping I call the identity coercion interpretation. The idea of this interpretation is to equate
subtyping relationships with typings of an identity coercion, i.e., whenever we assert $S \le T$,
we actually have a typing derivation showing that under the assumption that x has type S, the
reconstructed value $\mathrm{Id}_x$ has type T. This idea is not entirely new (a more traditional way of
expressing the identity coercion interpretation is that subtyping is witnessed by $\eta$-expansion, as
was discussed for example by Brandt and Henglein [1998]), but I think it is underappreciated.
It has several advantages over more typical definitions of subtyping (such as those based on an
axiomatization, or on a value inclusion interpretation), including:

• The question of deciding subtyping relationships is reduced to the question of type checking.
There is no need for a separate axiomatization, or a separate decision procedure.

• The subtyping relationship is sound by construction: it is no stronger than typing.

• Conversely, the identity coercion interpretation imposes a useful methodological constraint
when designing a type system: it must be strong enough to derive all expected subtyping
laws!

The last point is an important one in my opinion. For example, to respect the identity coercion
interpretation, I believe (although I won't attempt to prove formally) that any type system
for refining algebraic datatypes with intersections and unions must deal directly with
pattern-matching in some form or another. Essentially, pattern-matching provides the witness to the
expected distributivity properties of intersections and unions through sums and products (laws
such as $(A \oplus B) \cap (A \oplus C) \le A \oplus (B \cap C)$, etc.), which seem difficult or impossible
to capture with natural deduction-style typing rules.

On  the  other  hand,  when  a  subtyping  relationship  fails  by  this  interpretation,  all  we  know  is 
that  the  identity  coercion  fails  to  type  check.  In  contrast,  the  counterexamples  found  by  Davies 
and  Pfenning  [2000]  and  by  Dunfield  and  Pfenning  [2004]  demonstrated  that  certain  subtyping 
relationships  are  unsound.  To  better  understand  those  counterexamples,  we  consider  another 
notion of subtyping called the no-counterexamples interpretation. The idea here is that $S \lesssim T$
holds if any value of type S can be safely combined with a continuation accepting type T. The
relationship $\lesssim$ is effectively the largest possible subtyping relationship we could hope for, while
$\le$ is the smallest. The question, naturally, is how close they come to each other. In §6.2.12, I give
a  "conditional"  completeness  theorem,  which  shows  that  the  identity  coercion  interpretation  in 
fact  captures  all  safe  subtyping  relationships,  assuming  the  language  is  equipped  with  enough 
effects  (different  forms  of  nondeterminism).  We  can  then  take  this  for  what  it  is — we  can  either 
read  it  unconditionally  by  extending  the  language  (although  in  general  the  effects  are  fairly 
high-powered),  or  we  can  learn  to  live  with  incompleteness. 


6.2 Refining $\mathcal{L}^+$

As we did in Chapter 4, we will try to illustrate most of the general concepts with a simpler
special case, the positive fragment $\mathcal{L}^+$ of $\mathcal{L}$. We include the impure expression $\mho$ from the
beginning, because as we will see, it has a very interesting interpretation from the point of view
of refinement typing.

6.2.1  A  refinement  "restriction"? 

Freeman and Pfenning originally introduced refinements as a way of verifying additional properties
of already well-typed programs, rather than for allowing new programs to be typed. This
is the so-called refinement restriction, which is meant to be contrasted with the historical approach
to intersection types. Indeed, one of the best-known results about intersection types is that if
they are added to simply-typed $\lambda$-calculus, suddenly many more programs become typable: all
strongly normalizing ones [Pottinger, 1980]. But there is another way of comparing these two
approaches, in light of our discussion of intrinsic vs. extrinsic definitions (§4.1), and of typed vs.
untyped languages (§4.3.4). The historical approach to intersections should not be thought of
merely as an extension to simply-typed $\lambda$-calculus. Intersection type systems are always defined
extrinsically, with Curry-style rules assigning types to terms of the untyped $\lambda$-calculus. So, we
can view general intersection type systems as a special kind of refinement type system, subject
to the "refinement restriction", where the underlying language is very coarsely typed. Another
way of putting this is that "refinement" is really just a synonym for "extrinsic".

This is more than a play on words, I think. We saw in Chapter 4 that many properties
of $\mathcal{L}^+$ and $\mathcal{L}$ (their operational semantics, the separation theorem, etc.) could be explained in
an entirely generic way by reference to patterns, independently of types. The same will be
true in this chapter. We define extrinsic refinements of intrinsic $\mathcal{L}$-types through a notion of
pattern-inversion,  and  then  provide  generic,  pattern-based  rules  for  establishing  when  a  program 
is  well-refined.  Thus,  if  the  reader  wishes,  they  can  adapt  all  of  our  development  to  the  historical 
tradition  simply  by  starting  with  a  different  language  of  patterns:  all  value  patterns  introducing 
the  universal  positive  type,  and  all  continuation  patterns  introducing  the  universal  negative  type 
(cf.  §4.3.4).  A  legitimate  question  is  whether  there  is  any  point  in  having  a  fine-grained  layer  of 
intrinsic  types,  if  we  are  only  going  to  add  an  even  finer  layer  of  extrinsic  types.  I  believe  there 
is,  and  that  having  a  fine-grained  intrinsic  layer  makes  it  easier  to  reason  about  the  extrinsic 
layer — but  this  is  a  somewhat  subjective  question  that  needs  both  mathematical  and  empirical 
evidence,  outside  the  scope  of  this  thesis. 

6.2.2  Refining  types 

To help avoid confusion between intrinsic $\mathcal{L}$-types and extrinsic refinements of those types, we
will  follow  Pfenning  [2008]  and  call  the  latter  sorts.  To  avoid  inventing  too  many  neologisms, 
though,  we  will  still  refer  to  the  process  of  verifying  sorts  of  terms  as  refinement  typing,  and 
the  partial  order  between  sorts  as  a  subtyping  relationship. 

Notation (Sorts). We write $S \sqsubset A$ to indicate that S is a sort of (the intrinsic $\mathcal{L}$-type) A.

A sort $S \sqsubset A$ is not a subtype of A in the traditional sense. For example, if $S \sqsubset A$, then it will
also be the case that $\neg S \sqsubset \neg A$, violating contravariance. Likewise, if $S \sqsubset A$, then the assertion S
refines the assertion A, but so too does the denial $\bullet S$ refine the denial $\bullet A$.

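In order to explain how to define individual sorts through patterns, we first need an analogous
notion of refinement for frames:

Definition 6.2.1 (Frame refinements). A frame refinement $\Psi \sqsubset \Delta$ is built inductively out of
refutation refinements $\bullet S \sqsubset \bullet A$ (or annotated refinements $\kappa : \bullet S \sqsubset \kappa : \bullet A$) using concatenation and
conjunction, as described by the following rules:

\[
\frac{}{\cdot \sqsubset \cdot}
\qquad
\frac{\Psi_1 \sqsubset \Delta_1 \quad \Psi_2 \sqsubset \Delta_2}{\Psi_1, \Psi_2 \sqsubset \Delta_1, \Delta_2}
\qquad
\frac{}{\top \sqsubset \Delta}
\qquad
\frac{\Psi_1 \sqsubset \Delta \quad \Psi_2 \sqsubset \Delta}{\Psi_1 \wedge \Psi_2 \sqsubset \Delta}
\]

For $\Delta_1 \in \Delta_2$, $\Psi_1 \sqsubset \Delta_1$ and $\Psi_2 \sqsubset \Delta_2$, we define the containment relation $\Psi_1 \in \Psi_2$ as follows:

\[
\frac{}{\Psi \in_{\Delta} \Psi}
\qquad
\frac{\Psi \in_{\Delta_1} \Psi_1}{\Psi \in_{\Delta_1,\Delta_2} \Psi_1, \Psi_2}
\qquad
\frac{\Psi \in_{\Delta_2} \Psi_2}{\Psi \in_{\Delta_1,\Delta_2} \Psi_1, \Psi_2}
\qquad
\frac{\Psi \in_{\Delta} \Psi_1}{\Psi \in_{\Delta} \Psi_1 \wedge \Psi_2}
\qquad
\frac{\Psi \in_{\Delta} \Psi_2}{\Psi \in_{\Delta} \Psi_1 \wedge \Psi_2}
\]

We usually leave the intrinsic frame annotations implicit, writing simply $\Psi_1 \in \Psi_2$. In this relaxed
notation, the containment relation rules read like so:

\[
\frac{}{\Psi \in \Psi}
\qquad
\frac{\Psi \in \Psi_1}{\Psi \in \Psi_1, \Psi_2}
\qquad
\frac{\Psi \in \Psi_2}{\Psi \in \Psi_1, \Psi_2}
\qquad
\frac{\Psi \in \Psi_1}{\Psi \in \Psi_1 \wedge \Psi_2}
\qquad
\frac{\Psi \in \Psi_2}{\Psi \in \Psi_1 \wedge \Psi_2}
\]
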
The frame refinement constructors give us many different ways of writing effectively the same
thing. For example, if $S_1, T_1 \sqsubset A$ and $S_2, T_2 \sqsubset B$, we can refine the frame
$\kappa_1 : \bullet A, \kappa_2 : \bullet B$ by both

\[
\Psi_1 = (\kappa_1 : \bullet S_1, \kappa_2 : \bullet S_2) \wedge (\kappa_1 : \bullet T_1, \kappa_2 : \bullet T_2)
\quad\text{and}\quad
\Psi_2 = (\kappa_1 : \bullet S_1 \wedge \kappa_1 : \bullet T_1), (\kappa_2 : \bullet S_2 \wedge \kappa_2 : \bullet T_2),
\]

but these really represent the same frame refinement. Formally, we mean this in the following
sense:


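Observation 6.2.2. Any frame refinement $\Psi \sqsubset \Delta$ defines a map from continuation variables
$\kappa : \bullet A \in \Delta$ to finite sets of annotations $\Psi(\kappa) = \{\bullet S \mid \kappa : \bullet S \in \Psi\}$, where all the
$\bullet S \in \Psi(\kappa)$ refine $\bullet A$. Explicitly, $\Psi(\kappa)$ is defined as follows:

\[
\begin{array}{rcl}
(\kappa : \bullet S)(\kappa) &=& \{\bullet S\} \\[2pt]
(\Psi_1, \Psi_2)(\kappa) &=&
  \begin{cases}
    \Psi_1(\kappa) & \text{if } \kappa : \bullet A \in \Delta_1 \sqsupset \Psi_1 \\
    \Psi_2(\kappa) & \text{if } \kappa : \bullet A \in \Delta_2 \sqsupset \Psi_2
  \end{cases} \\[2pt]
\top(\kappa) &=& \emptyset \\[2pt]
(\Psi_1 \wedge \Psi_2)(\kappa) &=& \Psi_1(\kappa) \cup \Psi_2(\kappa)
\end{array}
\]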

So $\Psi_1$ and $\Psi_2$ are really the same frame refinement, in the sense that
$\Psi_1(\kappa_1) = \Psi_2(\kappa_1) = \{\bullet S_1, \bullet T_1\}$ and $\Psi_1(\kappa_2) = \Psi_2(\kappa_2) = \{\bullet S_2, \bullet T_2\}$.
The sets $\Psi(\kappa) = \{\bullet S_1, \ldots, \bullet S_n\}$ should be thought of conjunctively, as asserting that
$\kappa$ accepts each of the $S_i$. Beware, though, that because of the contravariant reading of $\bullet S$,
this is the same as saying that $\kappa$ accepts the union sort $S_1 \cup \cdots \cup S_n$.
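
As an informal illustration of Observation 6.2.2, the following Python sketch computes $\Psi(\kappa)$ by recursion on the frame refinement constructors and checks that the two refinements $\Psi_1$ and $\Psi_2$ above induce the same map. The class names (Hyp, Concat, Meet, and so on) and the function annotations are hypothetical encodings chosen for this sketch, not notation from the thesis, and sorts are represented simply by their names.

```python
from dataclasses import dataclass

# Hypothetical encoding of the frame-refinement constructors; none of these
# names come from the thesis.

@dataclass(frozen=True)
class Hyp:        # kappa : •S
    var: str
    sort: str

@dataclass(frozen=True)
class Empty:      # the empty frame refinement
    pass

@dataclass(frozen=True)
class Top:        # ⊤
    pass

@dataclass(frozen=True)
class Concat:     # Psi1, Psi2
    left: object
    right: object

@dataclass(frozen=True)
class Meet:       # Psi1 ∧ Psi2
    left: object
    right: object

def annotations(psi, kappa):
    """Return Psi(kappa) as a frozenset of sort names."""
    if isinstance(psi, Hyp):
        return frozenset({psi.sort}) if psi.var == kappa else frozenset()
    if isinstance(psi, (Empty, Top)):
        return frozenset()
    if isinstance(psi, (Concat, Meet)):
        # Assumption: variables in a frame are distinct, so for a
        # concatenation at most one side mentions kappa and the union
        # coincides with the case split in the definition.
        return annotations(psi.left, kappa) | annotations(psi.right, kappa)
    raise TypeError(f"not a frame refinement: {psi!r}")

# The two refinements Psi1 and Psi2 of the frame k1 : •A, k2 : •B.
psi1 = Meet(Concat(Hyp("k1", "S1"), Hyp("k2", "S2")),
            Concat(Hyp("k1", "T1"), Hyp("k2", "T2")))
psi2 = Concat(Meet(Hyp("k1", "S1"), Hyp("k1", "T1")),
              Meet(Hyp("k2", "S2"), Hyp("k2", "T2")))

same_map = all(annotations(psi1, k) == annotations(psi2, k) for k in ("k1", "k2"))
```

Here same_map compares the two induced maps pointwise, which is the sense in which the two syntactically different refinements are identified.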

Just as intrinsic types were defined by their patterns, extrinsic sorts are defined by
pattern-inversion. To define the sort $S \sqsubset A$ one specifies, for every pattern $p :: (\Delta \Vdash A)$, a finite set
$[\![p : S]\!]$ of frame refinements of $\Delta$. Intuitively, the set $[\![p : S]\!]$ is interpreted disjunctively, as the
different possible refinements of the continuation variables bound by the pattern, assuming it
has the given sort. A few examples should give an intuition for the idea:


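\[
\begin{array}{rcl}
[\![(\kappa_1, \kappa_2) : \neg S \otimes \neg T]\!] &=& \{(\kappa_1 : \bullet S, \kappa_2 : \bullet T)\} \\
[\![\kappa : \neg S \cap \neg T]\!] &=& \{(\kappa : \bullet S \wedge \kappa : \bullet T)\} \\
[\![\kappa : \neg S \cup \neg T]\!] &=& \{(\kappa : \bullet S), (\kappa : \bullet T)\} \\
[\![\mathsf{inl}\ \kappa_1 : \neg S \oplus \neg T]\!] &=& \{(\kappa_1 : \bullet S)\} \\
[\![(\kappa_1, \kappa_2) : \neg S \otimes \bot]\!] &=& \emptyset \\
[\![\kappa : \top]\!] &=& \{\top\}
\end{array}
\]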


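These examples are calculated from the following general definitions, as the reader can verify:

Intersections and unions

\[
\begin{array}{rcl}
[\![p : S \cap T]\!] &=& \{(\Psi_1 \wedge \Psi_2) \mid \Psi_1 \in [\![p : S]\!],\ \Psi_2 \in [\![p : T]\!]\} \\
[\![p : S \cup T]\!] &=& [\![p : S]\!] \cup [\![p : T]\!] \\
[\![p : \top]\!] &=& \{\top\} \\
[\![p : \bot]\!] &=& \emptyset
\end{array}
\]

Products, sums, negations

\[
\begin{array}{rcl}
[\![\kappa : \neg S]\!] &=& \{(\kappa : \bullet S)\} \\
[\![(p_1, p_2) : S \otimes T]\!] &=& \{(\Psi_1, \Psi_2) \mid \Psi_1 \in [\![p_1 : S]\!],\ \Psi_2 \in [\![p_2 : T]\!]\} \\
[\![\mathsf{inl}\ p : S \oplus T]\!] &=& [\![p : S]\!] \\
[\![\mathsf{inr}\ p : S \oplus T]\!] &=& [\![p : T]\!]
\end{array}
\]

\[
\frac{}{\top \sqsubset A}
\qquad
\frac{S \sqsubset A \quad T \sqsubset A}{S \cap T \sqsubset A}
\qquad
\frac{}{\bot \sqsubset A}
\qquad
\frac{S \sqsubset A \quad T \sqsubset A}{S \cup T \sqsubset A}
\]
\[
\frac{S \sqsubset A}{\neg S \sqsubset \neg A}
\qquad
\frac{S \sqsubset A \quad T \sqsubset B}{S \otimes T \sqsubset A \otimes B}
\qquad
\frac{S \sqsubset A \quad T \sqsubset B}{S \oplus T \sqsubset A \oplus B}
\]

Figure 6.1: Some sort constructors
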


In  order  for  these  general  definitions  of  pattern-inversion  to  make  sense,  we  have  to  respect  some 
implicit  conventions  on  the  formation  of  sorts.  For  example,  the  binary  union  and  intersection 
can  only  be  formed  when  both  S  and  T  refine  the  same  type  (i.e.,  the  "refinement  restriction"),  or 
else  we  might  not  be  able  to  invert  p  at  either  S  or  T.  These  implicit  conventions  are  described 
in  Figure  6.1.  Note,  though,  that  the  refinement  restriction  is  trivially  satisfied  if  we  view 
all  (positive)  sorts  as  refining  the  same  (positive)  universal  type.  In  addition  to  these  generic 
refinement constructors, we can define some more interesting refinements of datatypes. Recall
we defined booleans ($\mathsf{B}$), natural numbers ($\mathsf{N}$), and a paradoxical domain ($\mathsf{D} = \mathsf{N} \oplus \neg\mathsf{D}$) in
§4.2.1. We can refine these types to isolate more interesting properties:

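Sorts of booleans ($S \sqsubset \mathsf{B}$)

\[
\begin{array}{ll}
[\![\mathsf{tt} : \mathsf{T}]\!] = \{\cdot\} & [\![\mathsf{tt} : \mathsf{F}]\!] = \emptyset \\
[\![\mathsf{ff} : \mathsf{T}]\!] = \emptyset & [\![\mathsf{ff} : \mathsf{F}]\!] = \{\cdot\}
\end{array}
\]

Sorts of natural numbers ($S \sqsubset \mathsf{N}$)

\[
\begin{array}{lll}
[\![\mathsf{z} : \mathsf{N}_e]\!] = \{\cdot\} & [\![\mathsf{z} : \mathsf{N}_o]\!] = \emptyset & [\![\mathsf{z} : \mathsf{N}_{nz}]\!] = \emptyset \\
[\![\mathsf{s}\ p : \mathsf{N}_e]\!] = [\![p : \mathsf{N}_o]\!] & [\![\mathsf{s}\ p : \mathsf{N}_o]\!] = [\![p : \mathsf{N}_e]\!] & [\![\mathsf{s}\ p : \mathsf{N}_{nz}]\!] = [\![p : \top]\!]
\end{array}
\]

Sorts of the paradoxical domain ($S \sqsubset \mathsf{D}$)

\[
\begin{array}{ll}
[\![\mathsf{dn}\ p : \mathsf{D}_e]\!] = [\![p : \mathsf{N}_e]\!] & [\![\mathsf{dn}\ p : \mathsf{D}_*]\!] = \emptyset \\
[\![\mathsf{dk}\ p : \mathsf{D}_e]\!] = [\![p : \neg\mathsf{D}_e]\!] & [\![\mathsf{dk}\ p : \mathsf{D}_*]\!] = [\![p : \neg\top]\!]
\end{array}
\]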

These  different  sorts  have  intuitive  descriptions: 

• $\mathsf{T}$: the one-point subset of the booleans containing only "true"

• $\mathsf{F}$: the one-point subset of the booleans containing only "false"

• $\mathsf{N}_e$, $\mathsf{N}_o$, $\mathsf{N}_{nz}$: respectively, the even, odd, and positive natural numbers

• $\mathsf{D}_e$: the subset of $\mathsf{D}$ obtained by restricting natural numbers to be even, and propagating
the restriction through to continuations

• $\mathsf{D}_*$: the continuation subset of $\mathsf{D}$

It  will  be  useful  to  isolate  a  degenerate  case  of  pattern-inversion. 

Definition 6.2.3 (Absurd patterns). We say that an A-pattern p is absurd at sort $S \sqsubset A$ if $[\![p : S]\!] = \emptyset$.
For example, $\mathsf{s}\ \mathsf{z}$ is absurd at sort $\mathsf{N}_e$, and $\mathsf{dn}\ \mathsf{z}$ is absurd at $\mathsf{D}_*$. If p is absurd at S, we say that S refutes
p.
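
To make pattern-inversion concrete, here is a minimal Python sketch restricted to the natural-number sorts $\mathsf{N}_e$, $\mathsf{N}_o$, $\mathsf{N}_{nz}$ together with $\top$, $\bot$, intersection and union. The encoding is an assumption made for illustration only: patterns are the string "z" or a tuple ("s", p), sorts are strings or tuples ("and", S, T) and ("or", S, T), and frame refinements are left uninterpreted, since N-patterns bind no continuation variables.

```python
def invert(p, s):
    """Return [[p : s]] as a list of (uninterpreted) frame refinements."""
    if s == "Top":
        return ["TOP"]                      # [[p : ⊤]] = {⊤}
    if s == "Bot":
        return []                           # [[p : ⊥]] = ∅
    if isinstance(s, tuple) and s[0] == "and":
        return [("meet", a, b) for a in invert(p, s[1]) for b in invert(p, s[2])]
    if isinstance(s, tuple) and s[0] == "or":
        return invert(p, s[1]) + invert(p, s[2])
    if s not in ("Ne", "No", "Nnz"):
        raise ValueError(f"unknown sort: {s!r}")
    if p == "z":
        return ["."] if s == "Ne" else []   # zero is even, not odd, not positive
    if isinstance(p, tuple) and p[0] == "s":
        predecessor_sort = {"Ne": "No", "No": "Ne", "Nnz": "Top"}[s]
        return invert(p[1], predecessor_sort)
    raise ValueError(f"unknown pattern: {p!r}")

def absurd(p, s):
    """A pattern is absurd at a sort when its inversion is empty."""
    return not invert(p, s)

def numeral(n):
    """Build the pattern s (s (... z)) with n successors."""
    p = "z"
    for _ in range(n):
        p = ("s", p)
    return p
```

For instance, absurd(numeral(1), "Ne") reproduces the remark that s z is absurd at the even sort, and checking a range of numerals against ("and", "Ne", "No") illustrates (without proving) that the intersection of the even and odd sorts refutes every pattern.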

6.2.3  Refining  terms 

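Just as the terms of $\mathcal{L}^+$ were defined generically by reference to value patterns, without
mentioning any particular types, refinement typing of $\mathcal{L}^+$ is defined generically by reference to value
pattern-inversion. The refinement type system consists of four judgments:

\[
\Xi \vdash V : S \qquad \Xi \vdash K : \bullet S \qquad \Xi \vdash \sigma : \Psi \qquad \Xi \vdash E : \checkmark
\]

In general, the refinement typing judgment takes the form $\Xi \vdash t : \mathcal{J}$, where $t :: (\Gamma \vdash J)$ is an
$\mathcal{L}^+$-term, $\mathcal{J}$ refines J, and $\Xi$ refines the context $\Gamma$. Refinement of contexts is defined as you
would expect:

\[
\frac{}{\cdot \sqsubset \cdot}
\qquad
\frac{\Xi \sqsubset \Gamma \quad \Psi \sqsubset \Delta}{\Xi, \Psi \sqsubset \Gamma, \Delta}
\]

We write $\Psi_1 \in \Xi, \Psi$ if either $\Psi_1 \in \Xi$ or $\Psi_1 \in \Psi$. Again, any context refinement $\Xi \sqsubset \Gamma$
uniquely defines a map from continuation variables $\kappa : \bullet A \in \Gamma$ to sets of annotations
$\Xi(\kappa) = \{\bullet S \mid \kappa : \bullet S \in \Xi\}$.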
We  now  explain  how  to  establish  each  of  the  refinement  typing  judgments. 

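• ($V : S$): A value $p[\sigma]$ has sort S if there is some $\Psi$ in $[\![p : S]\!]$ such that $\sigma$ satisfies all of $\Psi$:

\[
\frac{\Psi \in [\![p : S]\!] \qquad \Xi \vdash \sigma : \Psi}{\Xi \vdash p[\sigma] : S}
\]

Note that implicit in this rule are the conditions:

\[
p :: (\Delta \Vdash A) \qquad \sigma :: (\Gamma \vdash \Delta) \qquad \Xi \sqsubset \Gamma \qquad \Psi \sqsubset \Delta \qquad S \sqsubset A
\]

But it is safe to leave these conditions implicit, since they are the only sensible way to
interpret the rule (by our definition of patterns and pattern-inversion).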

• ($K : \bullet S$): A continuation K accepts sort $S \sqsubset A$ if for every A-pattern p, in every possible
context $\Psi \in [\![p : S]\!]$, the expression $K(p)$ is well-sorted:

\[
\frac{\Psi \in [\![p : S]\!] \;\longrightarrow\; \Xi, \Psi \vdash K(p) : \checkmark}{\Xi \vdash K : \bullet S}
\]

Again, this rule leaves implicit various conditions that are forced for the rule to make sense.
Notice that the set $[\![p : S]\!]$ need not be a singleton (e.g., if S is a union), and so checking
the continuation could involve checking the same branch $K(p)$ multiple times.

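• ($\sigma : \Psi$): A substitution $\sigma$ satisfies all of $\Psi \sqsubset \Delta$ if for every continuation variable $\kappa : \bullet A \in \Delta$,
for every hypothesis $\bullet S \in \Psi(\kappa)$, the continuation $\sigma(\kappa)$ accepts sort S. We establish this
with rules that follow the structure of $\Psi$:

\[
\frac{}{\Xi \vdash \sigma : \top}
\qquad
\frac{\Xi \vdash \sigma : \Psi_1 \quad \Xi \vdash \sigma : \Psi_2}{\Xi \vdash \sigma : \Psi_1 \wedge \Psi_2}
\qquad
\frac{}{\Xi \vdash \cdot : \cdot}
\qquad
\frac{\Xi \vdash \sigma_1 : \Psi_1 \quad \Xi \vdash \sigma_2 : \Psi_2}{\Xi \vdash (\sigma_1, \sigma_2) : (\Psi_1, \Psi_2)}
\]

Notice that when $\Psi$ is a meet of two frame refinements $\Psi_1 \wedge \Psi_2$, we must check the same
substitution, and ultimately the same set of continuations, against multiple sorts.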

Proposition 6.2.4. The following rule for type checking substitutions is sound and complete:

\[
\frac{\kappa : \bullet S \in \Psi \;\longrightarrow\; \Xi \vdash \sigma(\kappa) : \bullet S}{\Xi \vdash \sigma : \Psi}
\]

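• ($E : \checkmark$): An expression $\kappa\ V$ is well-sorted in context $\Xi$ if there is at least some hypothesis
$\bullet S \in \Xi(\kappa)$ such that V has sort S:

\[
\frac{\kappa : \bullet S \in \Xi \qquad \Xi \vdash V : S}{\Xi \vdash \kappa\ V : \checkmark}
\]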

For reference, we include the complete set of refinement typing rules for $\mathcal{L}^+$ in Figure 6.2. One
note about interpreting these rules: Recall that in general we allow $\mathcal{L}^+$ continuations to be defined
by partial recursive functions on patterns, and $\mathcal{L}^+$ terms to be built up using arbitrary mutual
reference. A typing derivation for such a term mirrors its structure, i.e., can be non-well-founded.
By this convention, the non-terminating pseudo-expression $\Omega$ is necessarily well-sorted. On the
other hand, we have a choice as to how to treat the aborting expression $\mho$. We will find it far
more interesting to declare that $\mho$ is not well-sorted. Thus, we can see refinement checking as a
way of guaranteeing that a term does not make essential use of $\mho$ (even though it may mention
it syntactically). Formally, we will eventually establish a variant of the famous slogan,

well-sorted programs don't go $\mho$

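We consider some example $\mathcal{L}^+$ programs, old and new, to illustrate refinement typing.

Example 6.2.5. Recall the $(\mathsf{B} \otimes \mathsf{B}) \otimes \neg\mathsf{B}$-continuation $\mathit{and}^*$ from Example 4.2.9:

\[
\begin{array}{lcl}
\mathit{and}^*\ ((\mathsf{tt}, \mathsf{tt}), \kappa) &=& \kappa\ \mathsf{TT} \\
\mathit{and}^*\ ((\mathsf{tt}, \mathsf{ff}), \kappa) &=& \kappa\ \mathsf{FF} \\
\mathit{and}^*\ ((\mathsf{ff}, \mathsf{tt}), \kappa) &=& \kappa\ \mathsf{FF} \\
\mathit{and}^*\ ((\mathsf{ff}, \mathsf{ff}), \kappa) &=& \kappa\ \mathsf{FF}
\end{array}
\]
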
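(See body of §6.2.2 and §6.2.3 for implicit refinement restrictions.)

Frame refinements

\[ \Psi \sqsubset \Delta \;::=\; \kappa : \bullet S \mid \cdot \mid \Psi_1, \Psi_2 \mid \top \mid \Psi_1 \wedge \Psi_2 \]

Context refinements

\[ \Xi \sqsubset \Gamma \;::=\; \cdot \mid \Xi, \Psi \]

Refinement typing judgments

\[
\begin{array}{ll}
\Xi \vdash V : S & V \text{ has sort } S \\
\Xi \vdash K : \bullet S & K \text{ accepts sort } S \\
\Xi \vdash \sigma : \Psi & \sigma \text{ satisfies } \Psi \\
\Xi \vdash E : \checkmark & E \text{ is well-sorted}
\end{array}
\]

\[
\frac{\Psi \in [\![p : S]\!] \qquad \Xi \vdash \sigma : \Psi}{\Xi \vdash p[\sigma] : S}
\qquad
\frac{\Psi \in [\![p : S]\!] \;\longrightarrow\; \Xi, \Psi \vdash K(p) : \checkmark}{\Xi \vdash K : \bullet S}
\]
\[
\frac{}{\Xi \vdash \sigma : \top}
\qquad
\frac{\Xi \vdash \sigma : \Psi_1 \quad \Xi \vdash \sigma : \Psi_2}{\Xi \vdash \sigma : \Psi_1 \wedge \Psi_2}
\qquad
\frac{}{\Xi \vdash \cdot : \cdot}
\qquad
\frac{\Xi \vdash \sigma_1 : \Psi_1 \quad \Xi \vdash \sigma_2 : \Psi_2}{\Xi \vdash (\sigma_1, \sigma_2) : (\Psi_1, \Psi_2)}
\]
\[
\frac{\bullet S \in \Xi(\kappa) \qquad \Xi \vdash V : S}{\Xi \vdash \kappa\ V : \checkmark}
\qquad
\frac{}{\Xi \vdash \Omega : \checkmark}
\qquad
\Xi \nvdash \mho : \checkmark
\]

Figure 6.2: Refinement typing of $\mathcal{L}^+$
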
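We can assign $\mathit{and}^*$ many precise refinement types. For example, $\mathit{and}^*$ accepts sorts
$(\mathsf{T} \otimes \mathsf{T}) \otimes \neg\mathsf{T}$, $(\mathsf{F} \otimes \mathsf{F}) \otimes \neg\mathsf{F}$, etc., but not $(\mathsf{T} \otimes \mathsf{T}) \otimes \neg\mathsf{F}$,
$(\mathsf{F} \otimes \mathsf{F}) \otimes \neg\mathsf{T}$, etc. Indeed, we can fully capture the truth table by noting that $\mathit{and}^*$ accepts
the union sort

\[
\begin{array}{ll}
 & (\mathsf{T} \otimes \mathsf{T}) \otimes \neg\mathsf{T} \\
\cup & (\mathsf{T} \otimes \mathsf{F}) \otimes \neg\mathsf{F} \\
\cup & (\mathsf{F} \otimes \mathsf{T}) \otimes \neg\mathsf{F} \\
\cup & (\mathsf{F} \otimes \mathsf{F}) \otimes \neg\mathsf{F}
\end{array}
\]

which can also be expressed slightly more concisely as:

\[
(\mathsf{T} \otimes \mathsf{T}) \otimes \neg\mathsf{T} \;\cup\; (\mathsf{F} \otimes \top) \otimes \neg\mathsf{F} \;\cup\; (\top \otimes \mathsf{F}) \otimes \neg\mathsf{F}
\]

(Warning: $\mathsf{T} \sqsubset \mathsf{B}$ is something very different from $\top$!) The reader can try working through the
refinement typing derivations that establish these respective facts.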

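Saying that a continuation accepts a union is perhaps a bit unnatural. In the refinement type
system for full $\mathcal{L}$, we can talk about the isomorphic definition of $\mathit{and}^*$ as a negative value (of
type $\mathsf{B} \otimes \mathsf{B} \to {\downarrow}\mathsf{B}$), and say that it has an intersection sort:

\[
(\mathsf{T} \otimes \mathsf{T}) \to {\downarrow}\mathsf{T} \;\cap\; (\mathsf{F} \otimes \top) \to {\downarrow}\mathsf{F} \;\cap\; (\top \otimes \mathsf{F}) \to {\downarrow}\mathsf{F}
\]

For now, the reader can take this as an intuition for understanding the sort of $\mathit{and}^*$. ■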
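
The claims of Example 6.2.5 can be checked mechanically in a small Python sketch, offered here only as an illustration. It specializes the continuation rule to patterns of the form ((b1, b2), κ): a union sort is encoded (by assumption) as a list of triples (s1, s2, r) standing for (s1 ⊗ s2) ⊗ ¬r, inversion of a pattern returns the sorts r that κ is assumed to accept, and the body κ V is well-sorted when V has sort r. All function names are hypothetical.

```python
from itertools import product

def matches(b, s):
    """True when [[b : s]] is non-empty, for a boolean b and s in {T, F, Top}."""
    return s == "Top" or (b, s) in {("tt", "T"), ("ff", "F")}

def invert(b1, b2, sort):
    """[[((b1, b2), k) : sort]] for a union sort given as a list of triples.

    Each resulting frame refinement (k : •r) is represented just by r.
    """
    return [r for (s1, s2, r) in sort if matches(b1, s1) and matches(b2, s2)]

def and_star(b1, b2):
    """Body of and*: the boolean value that is passed to k."""
    return "tt" if (b1, b2) == ("tt", "tt") else "ff"

def accepts(body, sort):
    """K accepts the sort if every branch is well-sorted under every frame."""
    return all(
        matches(body(b1, b2), r)
        for b1, b2 in product(["tt", "ff"], repeat=2)
        for r in invert(b1, b2, sort)
    )

truth_table = [("T", "T", "T"), ("T", "F", "F"), ("F", "T", "F"), ("F", "F", "F")]
concise = [("T", "T", "T"), ("F", "Top", "F"), ("Top", "F", "F")]
```

Note that under the concise sort the pattern ((ff, ff), κ) inverts to two frame refinements, so its branch is checked twice, which is the behaviour noted after the continuation rule.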


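Example 6.2.6. We can similarly refine the type of the $(\mathsf{N} \otimes \mathsf{N}) \otimes \neg\mathsf{N}$-continuations $\mathit{plus}^*$ and
$\mathit{times}^*$ from Example 4.2.11. For example, $\mathit{plus}^*$ accepts the union sort

\[
(\mathsf{N}_e \otimes \mathsf{N}_e) \otimes \neg\mathsf{N}_e \;\cup\; (\mathsf{N}_e \otimes \mathsf{N}_o) \otimes \neg\mathsf{N}_o \;\cup\; (\mathsf{N}_o \otimes \mathsf{N}_e) \otimes \neg\mathsf{N}_o \;\cup\; (\mathsf{N}_o \otimes \mathsf{N}_o) \otimes \neg\mathsf{N}_e
\]

while $\mathit{times}^*$ accepts

\[
(\mathsf{N}_e \otimes \top) \otimes \neg\mathsf{N}_e \;\cup\; (\top \otimes \mathsf{N}_e) \otimes \neg\mathsf{N}_e \;\cup\; (\mathsf{N}_o \otimes \mathsf{N}_o) \otimes \neg\mathsf{N}_o
\]

In contrast to the example above, these refinements of course do not come close to characterizing
the behavior of $\mathit{plus}^*$ and $\mathit{times}^*$. They are only a partial specification. ■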


Example 6.2.7. Consider the following definition of a $\mathsf{D}$-continuation $\mathit{app0}$, which attempts to
apply its argument to zero:

\[
\begin{array}{lcl}
\mathit{app0}\ (\mathsf{dn}\ n) &=& \mho \\
\mathit{app0}\ (\mathsf{dk}\ \kappa) &=& \kappa\ (\mathsf{DN}\ \mathsf{Z})
\end{array}
\]

A $\mathsf{D}$-value can be either a natural number or a continuation, and in the former case there is
nothing useful for $\mathit{app0}$ to do, so its most sensible action is simply to abort. This style
of definition is typical in ordinary programming: when the type of a function is too coarse
to capture its intended use, an extra case is added raising an exception. This helps avoid the
dreaded SML "Warning: match nonexhaustive" compiler message, but unfortunately, it gives no
static guarantees that the function really is being used as intended, i.e., that the exception will
never be raised. As such, this situation provides a classic application for refinement types.

In our case, we can indeed verify that $\mathit{app0}$ accepts the refinement $\mathsf{D}_* \sqsubset \mathsf{D}$, because $\mathsf{D}_*$ rules
out the $\mathsf{dn}\ n$ branch. Observe that it will not accept $\mathsf{D}_e \sqsubset \mathsf{D}$, despite the fact that $\mathsf{Z}$ is even,
because that refinement does not rule out the $\mathsf{dn}\ n$ branch where the ill-sorted expression $\mho$ is
executed. ■


6.2.4  Refining  equality,  identity  and  composition 

In Chapter 4 we explained how to define syntactic equality for $\mathcal{L}^+$, how to compose two terms,
and how to build terms representing the identities for these operations. We want to know that
these notions behave reasonably with respect to refinement typing.

Proposition 6.2.8 (Refinement respects equality). If $\Xi \vdash t : \mathcal{J}$ and $t = t'$ then $\Xi \vdash t' : \mathcal{J}$.

Proof. Immediate from the definition of the refinement typing rules. □

Proposition 6.2.9 (Identity preserves refinement).

1. If $\kappa : \bullet S \in \Xi$ then $\Xi \vdash \mathrm{Id}_\kappa : \bullet S$

2. If $\Psi \in \Xi$ then $\Xi \vdash \mathrm{Id}[\Delta] : \Psi$ (where $\Psi \sqsubset \Delta$)

Proposition 6.2.10 (Composition preserves refinement).

1. If $\Xi \vdash K : \bullet S$ and $\Xi \vdash V : S$ then $\Xi \vdash K \bullet V : \checkmark$

2. If $\Xi \vdash \sigma : \Psi$ and $\Xi(\Psi) \vdash t : \mathcal{J}$ then $\Xi \vdash t[\sigma] : \mathcal{J}$

Proof. The proofs of these statements precisely mirror the construction of the identity and
composition terms, respectively. Again, it is important that we allow typing derivations to be as
non-well-founded as terms. □

Operationally,  we  will  be  more  interested  in  the  type  preservation  theorem  for  the  environment 
semantics  (i.e.,  n-ary  composition),  rather  than  the  binary  version  above.  Proposition  6.2.9  is 
likewise  just  a  special  case — reflexivity — of  the  identity  coercion  interpretation  of  subtyping. 
Nonetheless,  these  simple  properties  give  us  some  confidence  that  the  rules  of  refinement  typing 
make  sense. 

6.2.5  Refining  complex  hypotheses 

We can refine complex value hypotheses $x : A \in \Delta$ with hypotheses $x : S \in \Psi$. As with $\Psi(\kappa)$,
the set $\Psi(x) = \{S_1, \ldots, S_n\}$ is interpreted conjunctively. Because of the covariant reading of the
assertion S, this is the same as asserting an intersection $x : S_1 \cap \cdots \cap S_n$.

We type check a case-analysis on a value variable by applying pattern-inversion:

\[
\frac{\Psi_1 \in [\![p : S_1]\!] \;\cdots\; \Psi_n \in [\![p : S_n]\!] \;\longrightarrow\; \Xi(\Psi_1 \wedge \cdots \wedge \Psi_n) \vdash t_p : \mathcal{J}}
     {\Xi(x : S_1 \wedge \cdots \wedge x : S_n) \vdash \mathsf{case}\ x\ \mathsf{of}\ p \mapsto t_p : \mathcal{J}}
\]

It is important that we combine all the information from the different refinement assumptions
about x, in order to have the strongest possible reasoning principle. For example, we can verify
that $\Xi(x : \mathsf{N}_e \wedge x : \mathsf{N}_o) \vdash \mathsf{case}\ x\ \mathsf{of}\ p \mapsto \mho : \checkmark$, because every $\mathsf{N}$-pattern n is either absurd at $\mathsf{N}_e$
or absurd at $\mathsf{N}_o$.

Again,  we  should  check  that  the  various  operations  on  complex  hypotheses  respect  refine¬ 
ment  typing. 

Proposition 6.2.11 (Pattern substitution preserves refinement). If $\Xi(x : S_1 \wedge \cdots \wedge x : S_n) \vdash t : \mathcal{J}$
and $\Psi_1 \in [\![p : S_1]\!], \ldots, \Psi_n \in [\![p : S_n]\!]$ then $\Xi(\Psi_1 \wedge \cdots \wedge \Psi_n) \vdash t[p/x] : \mathcal{J}$.

Proposition 6.2.12 (Value identity preserves refinement). If $x : S \in \Xi$ then $\Xi \vdash \mathrm{Id}_x : S$.

Proof.  Immediate.  □ 

6.2.6  Subtyping:  the  identity  coercion  interpretation 

In  this  section  we  introduce  the  identity  coercion  interpretation  of  subtyping  and  study  some  of 
its  "internal"  properties. 

Definition 6.2.13 (Subtyping). For $S, T \sqsubset A$, we say that S is a subtype of T ($S \le_A T$) if there is a
derivation of $\kappa : \bullet T \vdash \mathrm{Id}_\kappa : \bullet S$. For $\Psi, \Psi' \sqsubset \Delta$, we say that $\Psi$ is a subframe of $\Psi'$ (written $\Psi \le_\Delta \Psi'$)
if there is a derivation of $\Psi \vdash \mathrm{Id}[\Delta] : \Psi'$.

We often omit the subscript and write $S \le T$, when the intrinsic type A can be inferred from
context or is unimportant. We write $S = T$ if both $S \le T$ and $T \le S$, and $S \nleq T$ if there is no
derivation of $S \le T$. We adopt the analogous conventions for the subframe relationship.

Intuitively, $S \le T$ says that we can uniformly convert any T-continuation into an S-continuation
by precomposing it with the identity, hence the identity coercion interpretation. We could
equivalently define subtyping in terms of a number of different kinds of identity coercions:

Proposition  6.2.14.  The  following  are  equivalent: 

1. $S \le T$

2. $x : S \vdash \mathrm{Id}_x : T$

3. $x : S, \kappa : \bullet T \vdash \kappa\ \mathrm{Id}_x : \checkmark$

4. For all A-patterns p, for every $\Psi \in [\![p : S]\!]$ there exists $\Psi' \in [\![p : T]\!]$ such that $\Psi \le \Psi'$

Proof. Immediate by expanding the definition of the identity terms and the typing rules. □

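
Criterion 4 of Proposition 6.2.14 suggests a direct decision procedure in simple cases. The Python sketch below is an illustration under stated assumptions: it treats only sorts of booleans built from T, F, ⊤, ⊥, intersection and union, where the only patterns are tt and ff and no continuation variables are bound, so that (as assumed here) the subframe condition $\Psi \le \Psi'$ holds trivially and subtyping reduces to comparing which patterns have non-empty inversions. The encoding of sorts as strings and tuples is hypothetical.

```python
def inhabited(b, s):
    """True when [[b : s]] is non-empty, for a boolean pattern b ('tt' or 'ff')."""
    if s == "Top":
        return True
    if s == "Bot":
        return False
    if s == "T":
        return b == "tt"
    if s == "F":
        return b == "ff"
    op, left, right = s
    if op == "and":
        return inhabited(b, left) and inhabited(b, right)
    if op == "or":
        return inhabited(b, left) or inhabited(b, right)
    raise ValueError(f"unknown sort: {s!r}")

def subtype(s, t):
    """Criterion 4 specialised to booleans: every pattern that can be
    inverted at s can also be inverted at t."""
    return all(inhabited(b, t) for b in ("tt", "ff") if inhabited(b, s))

def equivalent(s, t):
    return subtype(s, t) and subtype(t, s)
```

For example, subtype(("and", "T", "F"), "Bot") holds because both patterns are absurd at the intersection, and equivalent("Top", ("or", "T", "F")) holds because the two one-point sorts together cover both boolean patterns.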
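Proposition 6.2.15. Both $\le_A$ and $\le_\Delta$ are reflexive and transitive.

Proof. These properties are direct results of the fact that identity and composition preserve
refinement, combined with the unit laws (Lemma 4.2.15). For example, we can compose any
two derivations $\kappa : \bullet S_2 \vdash \mathrm{Id}_\kappa : \bullet S_1$ and $\kappa : \bullet S_3 \vdash \mathrm{Id}_\kappa : \bullet S_2$ to obtain
$\kappa : \bullet S_3 \vdash \mathrm{Id}_\kappa[\mathrm{Id}_\kappa] : \bullet S_1$. But $\mathrm{Id}_\kappa[\mathrm{Id}_\kappa] = \mathrm{Id}_\kappa$, and so $\kappa : \bullet S_3 \vdash \mathrm{Id}_\kappa : \bullet S_1$
(i.e., $S_1 \le S_3$) because refinement typing respects definitional equality. □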

Lemma 6.2.16 (Term inclusion/reverse inclusion).

• If $\Xi \vdash V : S$ and $S \le T$ then $\Xi \vdash V : T$

• If $\Xi \vdash K : \bullet T$ and $S \le T$ then $\Xi \vdash K : \bullet S$

• If $\Xi \vdash \sigma : \Psi$ and $\Psi \le \Psi'$ then $\Xi \vdash \sigma : \Psi'$

• If $\Xi(\Psi') \vdash E : \checkmark$ and $\Psi \le \Psi'$ then $\Xi(\Psi) \vdash E : \checkmark$

Proof.  All  immediate  consequences  of  composition  preservation  and  the  unit  laws.  □ 

Proposition 6.2.17. $\le_A$ is a distributive lattice with meet/join operations $\cap$, $\cup$, $\top$, $\bot$.

Proposition 6.2.18. $\oplus$, $\otimes$, and $\neg$ obey the usual covariant and contravariant laws:

1. If $S_1 \le T_1$ and $S_2 \le T_2$ then $S_1 \otimes S_2 \le T_1 \otimes T_2$

2. $S_1 \oplus S_2 \le T_1 \oplus T_2$ iff $S_1 \le T_1$ and $S_2 \le T_2$

3. $\neg S \le \neg T$ iff $T \le S$

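Proposition 6.2.19. Intersections and unions distribute through sums and products, as follows:

1. $S \otimes (T_1 \cap T_2) = (S \otimes T_1) \cap (S \otimes T_2)$

2. $S \otimes (T_1 \cup T_2) = (S \otimes T_1) \cup (S \otimes T_2)$

3.

4. $S \oplus (T_1 \cap T_2) = (S \oplus T_1) \cap (S \oplus T_2)$

5. $S \oplus (T_1 \cup T_2) = (S \oplus T_1) \cup (S \oplus T_2)$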

Proof  (of  Props.  6.2.17-6.2.19).  These  are  all  trivial  consequences  of  the  pattern-inversion  rules. 

It is worth pointing out that although these properties are typical, intuitive, and easy for us to derive, the fact that we can derive them within the language is not so typical. In particular, the reason why it is easy to derive distributivity laws such as $(S \oplus T_1) \cap (S \oplus T_2) \le S \oplus (T_1 \cap T_2)$ is because the identity coercions are defined by pattern-matching and checked by pattern-inversion. With the standard $\lambda$-calculus introduction and elimination rules for sums and intersections, for example, we cannot derive $x : (S \oplus T_1) \cap (S \oplus T_2) \vdash \eta(x) : S \oplus (T_1 \cap T_2)$ (unless we already assume a prior notion of subtyping to coerce $x$ to type $S \oplus (T_1 \cap T_2)$, but that is cheating). We can derive $x : (S \otimes T_1) \cap (S \otimes T_2) \vdash \eta(x) : S \otimes (T_1 \cap T_2)$ in the standard presentation of $\lambda$-calculus with products and intersections, but not when the projection elimination rules for products are replaced by the less standard but not uncommon splitting construct "let $(x, y) = e_1$ in $e_2$". In other words, if we want to define subtyping syntactically as we have, it is important that we have a good notion of syntax; conversely, the identity coercion interpretation provides a test that we really do have a good notion.

In addition to these familiar distributivity properties of intersections and unions through products and sums, we can consider the distributivity and non-distributivity properties of intersections/unions through positive negation.
Proposition 6.2.20.

1. (i) $\top = \neg\bot$ and (ii) $\neg S \cap \neg T = \neg(S \cup T)$

2. (i) $\neg\top \not\le \bot$ and (ii) $\neg(S \cap T) \not\le \neg S \cup \neg T$ unless $S \le T$ or $T \le S$

Proof. 1(i) reduces to checking $\top \vdash Id_\kappa : \bullet\bot$, which holds vacuously since $\bot$ refutes every pattern. 1(ii) reduces to checking $\kappa : \bullet S \wedge \kappa : \bullet T \vdash Id_\kappa : \bullet S \cup T$. Since $[\![p : S \cup T]\!] = [\![p : S]\!] \cup [\![p : T]\!]$, this reduces to checking that $Id_\kappa$ accepts both $S$ and $T$, and that holds by identity preservation.

2(i) fails because $[\![p : \bot]\!]$ is empty but $[\![p : \neg\top]\!]$ is not. 2(ii) requires that either $\kappa : \bullet S \cap T \vdash Id_\kappa : \bullet S$ or $\kappa : \bullet S \cap T \vdash Id_\kappa : \bullet T$, but by definition these hold if and only if $S \le T$ or $T \le S$. □

These  are  perhaps  the  most  interesting  laws  and  non-laws,  and  we  should  try  to  develop  an 
intuition  for  them.  As  one  easy  consequence,  we  derive  laws  and  non-laws  for  double-negated 
sorts: 

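Corollary 6.2.21 (Laws of double-negation).

1. $\neg\neg S \le \neg\neg T$ iff $S \le T$

2. $\neg\neg S \cap \neg\neg T \not\le \neg\neg(S \cap T)$ unless $S \le T$ or $T \le S$

3. $\neg\neg(S \cup T) \not\le \neg\neg S \cup \neg\neg T$ unless $S \le T$ or $T \le S$

Proof. (1) is immediate by Prop. 6.2.18(3), while (2) and (3) reduce to Prop. 6.2.20 as follows:

$$\neg\neg S \cap \neg\neg T = \neg(\neg S \cup \neg T) \not\le \neg\neg(S \cap T)$$
$$\neg\neg(S \cup T) = \neg(\neg S \cap \neg T) \not\le \neg\neg S \cup \neg\neg T$$

□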


6.2.7  Subtyping:  axiomatization 

Although the identity coercion interpretation itself gives a generic axiomatization of subtyping, by reducing it to the rules of refinement typing, it is instructive to give a more direct axiomatization by expanding the typing rules.

We begin by unrolling the meaning of $S \le T$ by one step:

$$\frac{\Psi \in [\![p : S]\!] \;\longrightarrow\; \Psi, \kappa : \bullet T \vdash \kappa\,(p[Id_\Psi]) : \checkmark}{\kappa : \bullet T \vdash Id_\kappa : \bullet S}$$

and likewise unrolling the meaning of $\Psi \le \Psi'$, using Proposition 6.2.4:

$$\frac{\bullet T \in \Psi'(\kappa) \;\longrightarrow\; \Psi \vdash Id_\kappa : \bullet T}{\Psi \vdash Id_\Delta : \Psi'}$$

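Can we go further? Well, to derive $\Psi, \kappa : \bullet T \vdash \kappa\,(p[Id_\Psi]) : \checkmark$, we first have to find some $\Psi' \in [\![p : T]\!]$, and then show that $\Psi, \kappa : \bullet T \vdash Id_\Psi : \Psi'$. The hypothesis about $\kappa$ is irrelevant, so this reduces to showing that for all $\kappa'$ and $\bullet T' \in \Psi'(\kappa')$, we can derive $\Psi \vdash Id_{\kappa'} : \bullet T'$. That is, the following inference is sound and complete:

$$\frac{\exists \Psi' \in [\![p : T]\!] \quad \forall\, \bullet T' \in \Psi'(\kappa') \quad \Psi \vdash Id_{\kappa'} : \bullet T'}{\Psi, \kappa : \bullet T \vdash \kappa\,(p[Id_\Psi]) : \checkmark}$$

And what about deriving $\Psi \vdash Id_\kappa : \bullet T$? We can unroll the definitions to see that the following inference is sound and complete:

$$\frac{\forall \Psi' \in [\![p : T]\!] \quad \exists\, \bullet S \in \Psi(\kappa) \quad \Psi', \kappa : \bullet S \vdash \kappa\,(p[Id_{\Psi'}]) : \checkmark}{\Psi \vdash Id_\kappa : \bullet T}$$

In other words, we have succeeded in reducing these two auxiliary typing judgments to each other!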

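Theorem 6.2.22 (Subtyping axiomatization). We define two relations $\Psi \le_p T$ and $S \le_\kappa \Psi$, where

$$p :: (\Delta \Vdash A) \qquad \kappa : \bullet B \in \Delta \qquad T \sqsubset A \qquad \Psi \sqsubset \Delta \qquad S \sqsubset B$$

by the following coinductive rules:

$$\frac{\exists \Psi' \in [\![p : T]\!] \quad \forall\, \bullet S \in \Psi'(\kappa) \quad S \le_\kappa \Psi}{\Psi \le_p T} \qquad \frac{\forall \Psi' \in [\![p : S]\!] \quad \exists\, \bullet T \in \Psi(\kappa) \quad \Psi' \le_p T}{S \le_\kappa \Psi}$$

Then the subtyping/subframing relationships are soundly and completely axiomatized as follows:

$$\frac{\forall \Psi \in [\![p : S]\!] \quad \Psi \le_p T}{S \le T} \qquad \frac{\forall\, \bullet S \in \Psi'(\kappa) \quad S \le_\kappa \Psi}{\Psi \le \Psi'}$$

Proof. A restatement of the previous paragraph. □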

6.2.8 Reconstructing the value/evaluation context restrictions

The failed distributivity laws for double-negation are closely related to the value and evaluation context restrictions discovered by Davies and Pfenning [2000] and by Dunfield and Pfenning [2004]. We can see this pretty directly if we recall (§4.4) that one way of understanding arbitrary terms of call-by-value $\lambda$-calculus is as $\mathcal{L}^+$ expressions with a distinguished continuation variable $\kappa : \bullet A$. Such expressions are in one-to-one correspondence with $\mathcal{L}^+$ values of double-negated type $\neg\neg A$. Suppose then that we have a single value that can be assigned two double-negated sorts, $V : \neg\neg S$ and $V : \neg\neg T$, where $S, T \sqsubset A$. We can validly assign it an intersection, $V : \neg\neg S \cap \neg\neg T$. But we cannot go on to conclude $V : \neg\neg(S \cap T)$, because of the failure of the principle $\neg\neg S \cap \neg\neg T \le \neg\neg(S \cap T)$. On the other hand, there is no value restriction on union introduction, because from either $V : \neg\neg S$ or $V : \neg\neg T$ we can correctly conclude $V : \neg\neg(S \cup T)$, applying the valid subtyping law $\neg\neg S \cup \neg\neg T \le \neg\neg(S \cup T)$.

Dually, an arbitrary (not necessarily evaluation) context for an ML term can be seen as an $\mathcal{L}^+$ continuation accepting a double-negated type. Suppose we have a single continuation accepting two double-negated sorts, $K : \bullet\neg\neg S$ and $K : \bullet\neg\neg T$. This immediately implies that it accepts the union $K : \bullet\neg\neg S \cup \neg\neg T$, but this does not imply $K : \bullet\neg\neg(S \cup T)$, because of the failure of $\neg\neg(S \cup T) \le \neg\neg S \cup \neg\neg T$. On the other hand, from either $K : \bullet\neg\neg S$ or $K : \bullet\neg\neg T$ we can correctly conclude $K : \bullet\neg\neg(S \cap T)$, so there is no restriction on intersection elimination.

It is also possible to understand the value restriction on its own, without resorting to subtyping, by looking directly at potential typing rules for expressions.

Notation. We write $\Xi \vdash E \div S$ as an abbreviation for $\Xi, \kappa : \bullet S \vdash E : \checkmark$. (Note that the typing judgment implicitly binds $\kappa$. If we want to be more explicit, we can write $\Xi \vdash \kappa.E \div S$.)

Consider the following rule for giving an expression an intersection type:

$$\frac{\Xi \vdash E \div S \qquad \Xi \vdash E \div T}{\Xi \vdash E \div S \cap T}\;*$$

We write $*$ because the rule is not admissible. If it were, we would show this by examining every potential use of the continuation variable $\kappa$ in $E$, and establishing that the use is well-sorted under the assumption that $\kappa : \bullet S \cap T$. Well, suppose there is a use $\kappa\ V$. From the first premise of the rule, we know that $V : S$ in a context containing at least $\kappa : \bullet S$ (and $\Xi$, and possibly other assumptions introduced during the course of the refinement typing derivation). From the second premise, we know that $V : T$ in a context containing at least $\kappa : \bullet T$. But this does not justify concluding that $V : S \cap T$ in a context containing $\kappa : \bullet S \cap T$ (and whatever other relevant hypotheses). One potentially useful observation we can make here, though, is that this step would be justified if the expression were linear (or affine), because then $\kappa$ could not occur in $V$.

We can give a similar direct explanation of the evaluation context restriction, but it requires the refinement type system for full $\mathcal{L}$.

6.2.9  The  environment  semantics  and  type  safety 

In this section we consider the traditional (extrinsic) type safety properties for $\mathcal{L}^+$, with respect to its operational semantics and refinement type system. The reader may want to review §4.2.12 to recall the definition of the environment semantics.

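Definition 6.2.23 (Environment typing). Let $\gamma$ be a $\Gamma$-environment. For any context refinement $\Xi \sqsubset \Gamma$, we check that $\gamma : \Xi$ as follows:

$$\frac{}{\mathrm{emp} : \cdot} \qquad \frac{\gamma : \Xi \qquad \Xi \vdash \sigma : \Psi}{\mathrm{bind}(\gamma, \sigma) : (\Xi, \Psi)}$$

Proposition 6.2.24. If $\gamma : \Xi$ and $\bullet S \in \Xi(\kappa)$ then $\Xi \vdash \mathrm{lookup}(\gamma, \kappa) : \bullet S$.

Proof. Immediate (the extrinsic analogue of Proposition 4.2.26). □

Definition 6.2.25 (Program typing). The program $(\gamma \mid E)$ is well-sorted if there is some $\Xi$ such that $\gamma : \Xi$ and $\Xi \vdash E : \checkmark$. The program $(\gamma \mid K \mid V)$ is well-sorted if there are some $\Xi$ and $S$ such that $\gamma : \Xi$ and $\Xi \vdash K : \bullet S$ and $\Xi \vdash V : S$.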

We  can  now  state  and  prove  the  progress  and  preservation  lemmas  in  their  usual,  extrinsic  form. 

Lemma 6.2.26 (Progress). If $P$ is well-sorted, there exists a $P'$ such that $P \leadsto P'$.

Proof.  Immediate,  as  in  the  proof  of  Lemma  4.2.30,  with  well-sortedness  taking  the  place  of  the 
purity  assumption.  □ 

Lemma 6.2.27 (Preservation). If $P$ is well-sorted and $P \leadsto P'$ then $P'$ is well-sorted.

Proof. The transition is either by lookup$^+$ or bind/call$^+$. If the former, we know that $P = (\gamma \mid \kappa\ V)$ and $P' = (\gamma \mid \mathrm{lookup}(\gamma, \kappa) \mid V)$. From the assumption that $P$ is well-sorted, there exists a context refinement $\Xi$ such that $\gamma : \Xi$ and $\Xi \vdash \kappa\ V : \checkmark$. Inverting the latter typing derivation, there must exist $\bullet S \in \Xi(\kappa)$ such that $\Xi \vdash V : S$. By Proposition 6.2.24, $\Xi \vdash \mathrm{lookup}(\gamma, \kappa) : \bullet S$. Hence $P'$ is well-sorted.

If the transition is by bind/call$^+$, we know that $P = (\gamma \mid K \mid p[\sigma])$ and $P' = (\mathrm{bind}(\gamma, \sigma) \mid K(p))$. From the assumption that $P$ is well-sorted, there exists a context refinement $\Xi$ such that $\gamma : \Xi$ and a sort $S$ such that $\Xi \vdash K : \bullet S$ and $\Xi \vdash p[\sigma] : S$. Inverting the value typing derivation, there must exist $\Psi \in [\![p : S]\!]$ such that $\Xi \vdash \sigma : \Psi$, which implies $\mathrm{bind}(\gamma, \sigma) : (\Xi, \Psi)$. Inverting the continuation typing derivation, we know that $\Xi, \Psi \vdash K(p) : \checkmark$. Hence $P'$ is well-sorted. □

Corollary 6.2.28 (Type safety). If $P$ is well-sorted, then $P \Downarrow R$ for some well-sorted $R$.

Because $\mho$ is ill-sorted by convention, we can now confidently declare,

well-sorted programs don't go $\mho$

6.2.10  Subtyping:  the  no-counterexamples  interpretation 

We  saw  that  the  identity  coercion  interpretation  of  subtyping  gives  a  synthetic  reconstruction  of 
the  value  and  evaluation  context  restrictions  for  intersections  and  unions.  On  the  other  hand, 
observing  that  the  unrestricted  laws  are  not  derivable  is  not  the  same  as  saying  they  are  unsafe, 
which  was  in  fact  what  Davies  and  Pfenning  [2000]  and  Dunfield  and  Pfenning  [2004]  noticed 
about  unrestricted  intersection  introduction  and  union  elimination.  In  this  section,  we  consider 
another  possible  interpretation  of  subtyping,  aiming  to  deal  directly  with  these  safety  violations. 

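Definition 6.2.29 (Safety). Let $E :: (\Gamma, \Delta \vdash \#)$ and $\sigma :: (\Gamma \vdash \Delta)$. For any $\Gamma$-environment $\gamma$, we say that $\gamma \models E \perp \sigma$ ($E$ and $\sigma$ are safe for each other in $\gamma$) if $(\mathrm{bind}(\gamma, \sigma) \mid E) \not\Downarrow \mho$. We write $\gamma \models V \perp K$ if $(\gamma \mid K \mid V) \not\Downarrow \mho$. We write $E \perp \sigma$ and $V \perp K$ if these relationships hold relative to emp.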

Safety is an orthogonality relation, in the sense of Melliès and Vouillon [2005]. It is interesting in its own right and we will describe some of its properties later, but for now we are content to use it to give another interpretation of subtyping.2

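Definition 6.2.30 (Safe subtyping). Let $\Psi_1, \Psi_2 \sqsubset \Delta$. We say that $\Psi_1$ is a safe subframe of $\Psi_2$ (written $\Psi_1 \preceq \Psi_2$) if for all substitutions $\sigma :: (\cdot \vdash \Delta)$ and expressions $E :: (\Delta \vdash \#)$,

if $\cdot \vdash \sigma : \Psi_1$ and $\Psi_2 \vdash E : \checkmark$ then $E \perp \sigma$

For two sorts $S, T \sqsubset A$ of a positive type, we say that $S$ is a safe subtype of $T$ (written $S \preceq T$) if for all values $V :: (\cdot \vdash A)$ and continuations $K :: (\cdot \vdash \bullet A)$,

if $\cdot \vdash V : S$ and $\cdot \vdash K : \bullet T$ then $V \perp K$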

We  call  this  the  no-counterexamples  interpretation  of  subtyping.  Clearly,  if  we  have  an  explicit 
witness  to  the  safety  of  a  subtyping  relationship  using  the  identity  coercion,  then  there  can  be 
no  counterexamples. 

2 Beware that our relation $E \perp \sigma$ is in fact the precise dual of that of Girard [2001], who defines orthogonality as normalization towards (rather than away from) $\mho$. The idea of defining an orthogonality relation by safety rather than termination comes from Melliès and Vouillon, inspired by Krivine-style realizability [Danos and Krivine, 2000, Krivine, 2001].
Theorem 6.2.31 (Soundness). $\Psi_1 \le \Psi_2$ implies $\Psi_1 \preceq \Psi_2$, and $S \le T$ implies $S \preceq T$.

Proof. Let $\cdot \vdash \sigma : \Psi_1$ and $\Psi_2 \vdash E : \checkmark$. By Lemma 6.2.16, we can use the witness $\Psi_1 \le \Psi_2$ either to coerce the substitution to $\cdot \vdash \sigma : \Psi_2$, or the expression to $\Psi_1 \vdash E : \checkmark$. In either case, we have a well-sorted program $(\mathrm{bind}(\mathrm{emp}, \sigma) \mid E)$, which cannot evaluate to $\mho$ by the type safety theorem. We reason similarly to show that $S \le T$ implies $S \preceq T$. □

The really interesting question is completeness: if the identity coercion does not check, can we come up with an explicit safety violation, i.e.,

does $S \not\le T$ imply $S \not\preceq T$, and $\Psi \not\le \Psi'$ imply $\Psi \not\preceq \Psi'$?

This question does not have a completely straightforward answer, however. The identity coercion interpretation is fixed by the pure, logical fragment of $\mathcal{L}^+$, but the no-counterexamples interpretation depends on the set of non-logical effects we have available. Indeed even just to define $\preceq$ we needed the "unsafe" aborting expression $\mho$, as well as at least one other "safe" observable result ($\Omega$ in our case) for $\preceq$ to be non-trivial.

In this respect, the relationship between $\le$ and $\preceq$ is reminiscent of the relationship between definitional equality and observational equivalence that we already explored in Chapter 4. We could hope to settle the completeness question similarly, by including an additional effect in $\mathcal{L}$ sufficient for building counterexamples to any invalid subtyping laws, just as we used ground input to observationally distinguish any two syntactically distinct $\mathcal{L}^+$ expressions (§4.2.15). Such an approach works technically, but morally, these effects are a little strange.

Essentially, we need two kinds of nondeterminism: demonic and angelic. Demonic nondeterminism, written $E_1 \curlywedge E_2$, is the more mundane: the expression $E_1 \curlywedge E_2$ can step to either $E_1$ or $E_2$, leaving the choice up to a maximally malicious environment, and so $E_1 \curlywedge E_2$ is safe in an environment only if both $E_1$ and $E_2$ are. This is how we often model nondeterminism in a programming language, because we believe in Murphy's Law, and want to know that a program will stay afloat even if everything that can possibly go wrong does go wrong. Angelic nondeterminism, written $E_1 \curlyvee E_2$, is the more mystical: an expression $E_1 \curlyvee E_2$ can again step to either $E_1$ or $E_2$, but now the choice is made by a benevolent environment, meaning that $E_1 \curlyvee E_2$ is safe if either $E_1$ or $E_2$ is. This kind of angelic nondeterminism for type safety is more difficult to accept in a programming language, although (in this binary version) it makes sense from a purely computational standpoint.
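As a rough illustration of the two readings of nondeterminism just described, the following Python sketch computes safety for a tiny expression language containing only the two results (divergence, which is safe, and abort, which is unsafe) and the two binary choices. The class names and the function `is_safe` are hypothetical, introduced only for this example; the sketch models the safety conditions stated above, not an operational semantics.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Diverge:
    """The safe result (divergence)."""


@dataclass(frozen=True)
class Abort:
    """The unsafe result (abort)."""


@dataclass(frozen=True)
class Demonic:
    """Demonic choice: a malicious environment picks the branch."""
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Angelic:
    """Angelic choice: a benevolent environment picks the branch."""
    left: "Expr"
    right: "Expr"


Expr = Union[Diverge, Abort, Demonic, Angelic]


def is_safe(e: Expr) -> bool:
    """Return True when the closed expression e cannot go wrong."""
    if isinstance(e, Diverge):
        return True
    if isinstance(e, Abort):
        return False
    if isinstance(e, Demonic):
        # Safe only if both alternatives are safe.
        return is_safe(e.left) and is_safe(e.right)
    if isinstance(e, Angelic):
        # Safe as soon as one alternative is safe.
        return is_safe(e.left) or is_safe(e.right)
    raise TypeError(f"not an expression: {e!r}")
```

Under this reading, nesting the two operators behaves like an and/or tree: a demonic node needs every branch to be safe, an angelic node needs only one.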

So, adding both $\curlywedge$ and $\curlyvee$ to $\mathcal{L}^+$ and considering the completeness question settled seems in poor taste. I will show that this is really true, i.e., that having these operations is sufficient for generating subtyping counterexamples. However, rather than taking them as generic primitives, we can consider them as properties of particular types. And here I really do mean intrinsic types, rather than their extrinsic refinements. A frame $\Delta$ defines a particular collection of expressions, and we can think of the operations $\curlywedge$ or $\curlyvee$ as making more or less sense when restricted to this particular collection. In some cases, we can define $\curlywedge$ or $\curlyvee$ for a frame in terms of the operations on smaller frames (in the sense of the definition ordering). As a trivial example, in $\mathcal{L}^+$ we can always define both operations for the empty frame (which only has the expressions $\Omega$ and $\mho$) by:

$$E \curlywedge \mho = \mho \qquad E \curlywedge \Omega = E \qquad E \curlyvee \mho = E \qquad E \curlyvee \Omega = \Omega$$

Therefore we can consider a more refined but open-ended version of the completeness question:

under reasonable computational assumptions, for which types/frames,
does $S \not\le_A T$ imply $S \not\preceq_A T$, and $\Psi \not\le_\Delta \Psi'$ imply $\Psi \not\preceq_\Delta \Psi'$?


6.2.11  Some  counterexample  examples  (and  counterexamples) 

Before  considering  the  general  version  of  the  completeness  question,  I  want  to  better  illustrate 
the  nature  of  the  question  with  some  specific  examples  of  counterexamples,  and  some  seeming 
counterexamples  to  the  existence  of  counterexamples. 

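Often, when $S \le T$ fails for "obvious" reasons, it is easy to come up with counterexamples to $S \preceq T$. For instance, it is trivial that $\mathsf{T} \not\le \mathsf{F}$, and we can easily come up with a counterexample:

$$V = \mathsf{TT} \qquad K\ tt = \mho \qquad K\ ff = \Omega$$

We have $V : \mathsf{T}$ and $K : \bullet\mathsf{F}$, but $V \not\perp K$. Therefore:

Proposition 6.2.32. $\mathsf{T} \not\preceq \mathsf{F}$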

The really interesting completeness questions have to do with the non-distributivity of intersections through negation (Proposition 6.2.20). In the 0-ary case $\neg\top \not\le \bot$, we can always come up with a counterexample by pairing the continuation which diverges on all inputs, treated as a value, together with the continuation that ignores its argument and aborts. That is, we take

$$V = (p \mapsto \Omega) \qquad K\ \kappa = \mho$$

(Really we should be writing $\_[(p \mapsto \Omega)]$ for the value, but the meaning of this shorthand is hopefully clear.) Note that whatever the type $\neg A$ the sorts refine, we have $V : \neg\top$ and $K : \bullet\bot$ and $V \not\perp K$. Hence,

Proposition 6.2.33. For all positive types $A$, $\neg\top \not\preceq_{\neg A} \bot$.

But what about the binary case, $\neg(S \cap T) \not\le \neg S \cup \neg T$? Let's consider it at almost the simplest possible type, $\neg\mathbb{B}$, where already there is a bit of subtlety. Take $S = \mathsf{T}$ and $T = \mathsf{F}$. Can we come up with a counterexample to show that $\neg(\mathsf{T} \cap \mathsf{F}) \not\preceq \neg\mathsf{T} \cup \neg\mathsf{F}$?

It is easy to come up with a counterexample to the value inclusion. Take the continuation that aborts on any boolean, treated as a $\neg\mathbb{B}$-value:

$$V = (b \mapsto \mho)$$

We have that $V : \neg(\mathsf{T} \cap \mathsf{F})$ (because $\mathsf{T} \cap \mathsf{F} = \bot$ refutes every $\mathbb{B}$-pattern) but neither $V : \neg\mathsf{T}$ nor $V : \neg\mathsf{F}$. However, failure of value inclusion is not sufficient for producing a counterexample to safety: we also need to give a continuation $K$ accepting sort $\neg\mathsf{T} \cup \neg\mathsf{F}$ that is unsafe for $V$. What would such a continuation look like?
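The failure of value inclusion can be checked by brute force in a small model. The Python sketch below assumes, as a simplification that is not part of the formal development, that an effect-free boolean continuation is just a table sending each boolean to divergence or abort, that a sort of booleans is a set of booleans, and that a continuation accepts a sort when it diverges on every member. All names (`CONTINUATIONS`, `accepts`, `neg`) are hypothetical.

```python
from itertools import product

BOOLS = ("TT", "FF")
DIVERGE, ABORT = "diverge", "abort"

# All four effect-free boolean continuations, as tables boolean -> result.
CONTINUATIONS = [
    dict(zip(BOOLS, results))
    for results in product((DIVERGE, ABORT), repeat=len(BOOLS))
]


def accepts(k, sort):
    """k accepts a sort (a set of booleans) if it diverges on each member."""
    return all(k[b] == DIVERGE for b in sort)


def neg(sort):
    """Indices of the continuations that, viewed as values, inhabit the negated sort."""
    return frozenset(i for i, k in enumerate(CONTINUATIONS) if accepts(k, sort))


T = frozenset({"TT"})  # the sort of the true boolean
F = frozenset({"FF"})  # the sort of the false boolean

# The valid law: an intersection of negations is the negation of the union.
valid_law = neg(T) & neg(F) == neg(T | F)

# The invalid direction: values of the negated intersection
# that have neither of the two negated sorts.
witnesses = neg(T & F) - (neg(T) | neg(F))
```

In this model `witnesses` contains exactly one table, the continuation that aborts on both booleans, matching the value $V$ chosen above.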

Well, $K$ takes a $\mathbb{B}$-continuation variable $\kappa$ as an argument, and does something with it. But what can it do? Because $K$ accepts sort $\neg\mathsf{T} \cup \neg\mathsf{F}$, the body of the continuation $K(\kappa)$ must be well-sorted whether we assume $\kappa : \bullet\mathsf{T}$ or $\kappa : \bullet\mathsf{F}$. But if we ever try to pass a boolean value to $\kappa$, it must be either $\mathsf{TT}$ or $\mathsf{FF}$, and hence the application will be ill-sorted under one of these assumptions. Basically, we see that in our language ($\mathcal{L}^+$ with $\Omega$ and $\mho$, and no other additional effects) the only continuation accepting sort $\neg\mathsf{T} \cup \neg\mathsf{F}$ is the one that ignores its argument and diverges:

$$K\ \kappa = \Omega$$

Yet, this $K$ also accepts $\neg(\mathsf{T} \cap \mathsf{F})$, and indeed is safe for $V$. Thus, we are led to our first failure of completeness:

Proposition 6.2.34. In $\mathcal{L}^+$ (with no additional effects), $\neg(\mathsf{T} \cap \mathsf{F}) \preceq \neg\mathsf{T} \cup \neg\mathsf{F}$.

Is there any way to recover completeness, by introducing a new effect? Yes, this is a situation where the angelic choice operator $E_1 \curlyvee E_2$ can come to the rescue.

I admire your perverted ingenuity in inventing one definition after another as barricades against the falsification of your pet ideas.

— Alpha,  speaking  to  Delta  in  Imre  Lakatos'  Proofs  and  Refutations 

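Definition 6.2.35 (Orthogonal). The safety relation induces an operation $(-)^\perp$ on sets of terms $\bar{t}$, called the orthogonal:

$$\bar{\sigma}^\perp = \{E \mid \forall \sigma \in \bar{\sigma}.\ E \perp \sigma\} \qquad \bar{E}^\perp = \{\sigma \mid \forall E \in \bar{E}.\ E \perp \sigma\}$$

$$\bar{V}^\perp = \{K \mid \forall V \in \bar{V}.\ V \perp K\} \qquad \bar{K}^\perp = \{V \mid \forall K \in \bar{K}.\ V \perp K\}$$

We write $t^\perp$ for the corresponding operation on the singleton set.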

Definition 6.2.36 (Choice operators). A frame $\Delta$ is said to have (binary) demonic choice if for every pair of expressions $E_1, E_2 :: (\Delta \vdash \#)$, there is an expression $E_1 \curlywedge E_2 :: (\Delta \vdash \#)$, such that $(E_1 \curlywedge E_2)^\perp = E_1^\perp \cap E_2^\perp$, admitting the following typing rule for all $\Psi \sqsubset \Delta$:

$$\frac{\Psi \vdash E_1 : \checkmark \qquad \Psi \vdash E_2 : \checkmark}{\Psi \vdash E_1 \curlywedge E_2 : \checkmark}$$

It is said to have binary angelic choice if for every pair $E_1, E_2$ there is an expression $E_1 \curlyvee E_2$ such that $(E_1 \curlyvee E_2)^\perp = E_1^\perp \cup E_2^\perp$, admitting the following typing rules for all $\Psi \sqsubset \Delta$:

$$\frac{\Psi \vdash E_1 : \checkmark}{\Psi \vdash E_1 \curlyvee E_2 : \checkmark} \qquad \frac{\Psi \vdash E_2 : \checkmark}{\Psi \vdash E_1 \curlyvee E_2 : \checkmark}$$

We say that a type $A$ has [angelic/demonic] choice if the singleton frame $(\kappa : \bullet A)$ does. Since these properties hold relative to a language (an extension of $\mathcal{L}^+$), we speak of a language having [angelic/demonic] $\Delta$-choice or $A$-choice.

In other words, expanding the definition of the orthogonal, for any substitution $\sigma$, we have $E_1 \curlywedge E_2 \perp \sigma$ iff both $E_1 \perp \sigma$ and $E_2 \perp \sigma$, while $E_1 \curlyvee E_2 \perp \sigma$ iff either $E_1 \perp \sigma$ or $E_2 \perp \sigma$. Observe that binary choice implies finite choice, because $\Omega$ and $\mho$ provide the units of the operations:

Proposition  6.2.37.  We  can  build  the  demonic/angelic  choice  of  finitely  many  expressions  by 


Note that the demonic choice of finitely many expressions is well-sorted if all of the expressions are, and the angelic choice is well-sorted if at least one of the expressions is.
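One way to read the observation that the two results provide the units is as a fold over a finite list of alternatives. The Python sketch below works only at the level of safety (a boolean per closed expression, with divergence safe and abort unsafe); the function names are hypothetical and the encoding is an assumption made for illustration, not the construction used in the proposition.

```python
from functools import reduce

# Safety of the two results: divergence is safe, abort is unsafe.
DIVERGE_SAFE = True
ABORT_SAFE = False


def demonic(a, b):
    """Binary demonic choice on safety values: safe only if both are."""
    return a and b


def angelic(a, b):
    """Binary angelic choice on safety values: safe if either is."""
    return a or b


def demonic_all(branches):
    """Finite demonic choice, folded from the unit (divergence)."""
    return reduce(demonic, branches, DIVERGE_SAFE)


def angelic_any(branches):
    """Finite angelic choice, folded from the unit (abort)."""
    return reduce(angelic, branches, ABORT_SAFE)
```

The empty demonic choice is therefore safe and the empty angelic choice unsafe, mirroring the roles of divergence and abort as units.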
In fact, binary demonic choice implies countable demonic choice, essentially because failure is finite, but we will have no need for this below.3 Note also that $\curlywedge$ and $\curlyvee$ are, by definition, associative, commutative, and idempotent with respect to safety.4

Proposition 6.2.38. In any extension of $\mathcal{L}^+$ with angelic $\mathbb{B}$-choice, $\neg(\mathsf{T} \cap \mathsf{F}) \not\preceq \neg\mathsf{T} \cup \neg\mathsf{F}$.

Proof. We pair the value $V = (b \mapsto \mho)$ with the continuation $K\ \kappa = \kappa\ \mathsf{TT} \curlyvee \kappa\ \mathsf{FF}$. Note that $V : \neg(\mathsf{T} \cap \mathsf{F})$ and $K : \bullet\neg\mathsf{T} \cup \neg\mathsf{F}$, but $V \not\perp K$. □

Does this make sense? Perhaps. We might think of the angelic choice operator $E_1 \curlyvee E_2$ as first executing $E_1$ in a "sandbox", and if anything goes wrong, throwing the computation away and executing $E_2$. Thus this program will try passing $\mathsf{TT}$ to $(b \mapsto \mho)$, fail, and then try passing $\mathsf{FF}$ to $(b \mapsto \mho)$, but that also fails. Or we might really interpret the angelic choice as being implemented by our guardian angel, who tries as hard as they can to find a safe expression (which is in this case impossible). Whether or not either of these interpretations corresponds to a realistic operational semantics, in any case let us suspend disbelief for now.
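The sandbox reading can be sketched in a few lines of Python. Here a boolean continuation is modeled as a function from a boolean to a safety flag (`True` for divergence, `False` for abort), and the angelic choice runs its alternatives in order, keeping the first safe one. The encoding and the names `angelic`, `V`, `K`, `accepts_only_true`, and `accepts_only_false` are assumptions made for this illustration.

```python
SAFE, UNSAFE = True, False


def angelic(*thunks):
    """Sandbox reading: try each alternative in order, stop at the first safe one."""
    return any(thunk() for thunk in thunks)


def V(b):
    """The value that aborts on any boolean."""
    return UNSAFE


def K(kappa):
    """The continuation that passes TT to kappa or, failing that, FF."""
    return angelic(lambda: kappa("TT"), lambda: kappa("FF"))


def accepts_only_true(b):
    """A continuation of the sort accepting only the true boolean."""
    return SAFE if b == "TT" else UNSAFE


def accepts_only_false(b):
    """A continuation of the sort accepting only the false boolean."""
    return SAFE if b == "FF" else UNSAFE
```

In this model `K` is safe for each of the two one-sided continuations but unsafe for `V`, which is the shape of the counterexample in the proof above.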

We can build a more interesting counterexample by considering the invalid law at higher type. Take $S = \neg\mathsf{T}$ and $T = \neg\mathsf{F}$, noting (by the valid distributivity law for positive negation) that $\neg\mathsf{T} \cap \neg\mathsf{F} = \neg(\mathsf{T} \cup \mathsf{F})$. How do we show $\neg\neg(\mathsf{T} \cup \mathsf{F}) \not\preceq \neg\neg\mathsf{T} \cup \neg\neg\mathsf{F}$?

First, we have to come up with a value of sort $\neg\neg(\mathsf{T} \cup \mathsf{F})$. If we try to do this within the deterministic fragment, we quickly realize that any such value will also have one of the sorts $\neg\neg\mathsf{T}$ or $\neg\neg\mathsf{F}$, and so cannot play a role in a safety violation. However, using demonic $\mathbb{B}$-choice, we have a nice candidate:

$$V = (\kappa \mapsto \kappa\ \mathsf{TT} \curlywedge \kappa\ \mathsf{FF})$$

We can think of $V$ as a coin flip, i.e., as a suspended boolean computation that nondeterministically returns true or false. Note that $V : \neg\neg(\mathsf{T} \cup \mathsf{F})$, but neither $V : \neg\neg\mathsf{T}$ nor $V : \neg\neg\mathsf{F}$. What about the matching continuation? By analogy with the previous example, it is not difficult to come up with the following continuation:

$$K\ \kappa^* = \kappa^*\ K_{tt} \curlyvee \kappa^*\ K_{ff}$$

where $K_{tt}$ and $K_{ff}$ are defined by

$$K_{tt}\ tt = \Omega \qquad K_{ff}\ tt = \mho$$
$$K_{tt}\ ff = \mho \qquad K_{ff}\ ff = \Omega$$

It is easy to verify that $K : \bullet\neg\neg\mathsf{T} \cup \neg\neg\mathsf{F}$ (but not $K : \bullet\neg\neg(\mathsf{T} \cup \mathsf{F})$), and that $V \not\perp K$. But what does this program actually do?

It turns out that we don't actually have to postulate the angelic $\neg\mathbb{B}$-choice operation, because we can already implement it within $\mathcal{L}^+$. Consider the following alternate definition of $K$:

$$K\ \kappa^* = \kappa^*\ (b \mapsto \kappa^*\ K_b)$$

3 Because divergence is safe and aborting is unsafe, it is the demonic choice (rather than the angelic) that is analogous to the usual "parallel-or" construct: $E_1 \curlywedge E_2$ is unsafe (i.e., terminates with failure) if either $E_1$ or $E_2$ is unsafe.

4 $\curlywedge$ and $\curlyvee$ can be seen as lattice operations for the opposite of the approximation ordering $E \le E'$ defined in §4.2.14. The flip is because "less defined" coincides with "more safe".




which  can  be  expanded  out  to 


Again, the reader can verify that $K : \bullet\neg\neg\mathsf{T} \cup \neg\neg\mathsf{F}$ (but not $K : \bullet\neg\neg(\mathsf{T} \cup \mathsf{F})$), and that $V \not\perp K$. And now it is clear what is going on: $K$ evaluates its argument twice (i.e., passes it a $\mathbb{B}$-continuation), and checks that it returns the same value each time. Hence $K$ can go wrong when paired with the value $V$, which sometimes returns $\mathsf{TT}$ and sometimes $\mathsf{FF}$.
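The behaviour of this pair can be replayed in a small Python model. A suspended boolean computation is represented as a function that takes a boolean continuation and returns a safety flag (`True` when nothing can go wrong); demonic choice is modeled by requiring both branches to be safe, angelic choice by requiring one. This encoding and all names in it are assumptions for illustration only.

```python
SAFE, UNSAFE = True, False


def coin_flip(k):
    """The coin flip: demonically passes TT or FF to k, so k must be safe on both."""
    return k("TT") and k("FF")


def always(b):
    """A deterministic suspended boolean that always returns b."""
    return lambda k: k(b)


def check_equals(b):
    """The continuation K_b: diverge on b, abort on the other boolean."""
    return lambda b2: SAFE if b2 == b else UNSAFE


def K_twice(v):
    """Run v once to get b, then run v again and check it returns b."""
    return v(lambda b: v(check_equals(b)))


def K_angelic(v):
    """The earlier definition: angelically guess which boolean v returns."""
    return v(check_equals("TT")) or v(check_equals("FF"))
```

In this model both versions of the continuation are safe for the two deterministic computations and unsafe for the coin flip, which is consistent with the claim that evaluating the argument twice implements the angelic choice used earlier.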

Proposition 6.2.39. In any extension of $\mathcal{L}^+$ with demonic $\mathbb{B}$-choice, $\neg\neg(\mathsf{T} \cup \mathsf{F}) \not\preceq \neg\neg\mathsf{T} \cup \neg\neg\mathsf{F}$.

6.2.12  The  (conditional)  completeness  of  the  identity  coercion  interpretation 

Let  us  try  to  collect  some  of  the  previous  observations  into  a  more  systematic  answer  to  the 
subtyping  completeness  question. 

Recall (from §2.1.2) that the relation $\prec$ is defined as the transitive closure of the following clauses:

$$\frac{\kappa : \bullet A \in \Delta}{A \prec \Delta} \qquad \frac{p :: (\Delta \Vdash B)}{\Delta \prec B}$$

And again, recall that the types/frames below a type/frame in the $\prec$ relation are called its ancestors.

Definition  6.2.40  (Ancestral  properties).  We  speak  of  a  property  being  ancestral  for  a  type/frame,  if 
it  holds  for  all  of  its  ancestors  (though  not  necessarily  for  itself).  In  particular,  we  say  that  a  type/frame 
has  ancestral  choice  if  all  of  its  ancestor  frames  have  both  angelic  and  demonic  choice. 

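Notation. If $A$ has ancestral choice, we can define choice operators for $A$-continuations by the following maps:

$$(K_1 \curlywedge K_2)(p) = K_1(p) \curlywedge K_2(p) \qquad (K_1 \curlyvee K_2)(p) = K_1(p) \curlyvee K_2(p)$$

Similarly, if $\Delta$ has ancestral choice, we can define choice operators for $\Delta$-substitutions by the following maps:

$$(\sigma_1 \curlywedge \sigma_2)(\kappa) = \sigma_1(\kappa) \curlywedge \sigma_2(\kappa) \qquad (\sigma_1 \curlyvee \sigma_2)(\kappa) = \sigma_1(\kappa) \curlyvee \sigma_2(\kappa)$$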

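Proposition 6.2.41. The derived choice operators for continuations satisfy the following orthogonality conditions:

$$(K_1 \curlywedge K_2)^\perp = K_1^\perp \cap K_2^\perp \qquad (K_1 \curlyvee K_2)^\perp = K_1^\perp \cup K_2^\perp$$

and admit the following refinement typing rules:

$$\frac{\cdot \vdash K_1 : \bullet S \qquad \cdot \vdash K_2 : \bullet S}{\cdot \vdash K_1 \curlywedge K_2 : \bullet S} \qquad \frac{\cdot \vdash K_1 : \bullet S}{\cdot \vdash K_1 \curlyvee K_2 : \bullet S} \qquad \frac{\cdot \vdash K_2 : \bullet S}{\cdot \vdash K_1 \curlyvee K_2 : \bullet S}$$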

Proof.  Immediate  from  the  definition  of  the  derived  choice  operators  and  the  refinement  typing 
rules  for  continuations.  □ 

Proposition 6.2.42. The derived demonic choice operator for substitutions satisfies:

$$(\sigma_1 \curlywedge \sigma_2)^{\perp} \subseteq \sigma_1^{\perp} \cap \sigma_2^{\perp}$$

and admits the following typing rule:

$$\frac{\cdot \vdash \sigma_1 : \Psi \quad \cdot \vdash \sigma_2 : \Psi}{\cdot \vdash \sigma_1 \curlywedge \sigma_2 : \Psi}$$

Proof. The orthogonality conditions express that if an expression is safe for a demonic choice
of substitutions then it is safe for each substitution. This is clear because the demonic choice
can always mimic the behavior of either substitution during an execution. Note however that
the converse does not necessarily hold: by analogy, the empty demonic choice (the substitution
κ ↦ p ↦ ff) is not safe for every expression (in particular it is unsafe for ℧).

The  admissibility  of  the  typing  rule  follows  immediately  from  the  definition  of  the  demonic 
choice  operator  and  the  derived  typing  rule  for  substitutions  (Prop.  6.2.4).  □ 

In  order  to  simplify  the  statement  and  proof  of  the  completeness  theorem,  we  make  the  following 
observation: 

Observation 6.2.43 (Finitary polymorphism). For all sorts $S \sqsubset A$ and A-patterns p, the set
$\llbracket p : S \rrbracket$ is finite. Likewise, for all frame refinements $\Psi \sqsubset \Delta$ and variables
$\kappa : {\bullet}A \in \Delta$, the set $\Psi(\kappa)$ is finite.

This  is  a  property  of  our  type  system,  although  one  could  easily  imagine  an  extension  with 
infinitary  polymorphism  that  breaks  it — then  we  would  have  to  deal  explicitly  with  countable 
choice  operators.  We  now  state  the  main  lemma: 

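Notation. For any sort $S \sqsubset A$, we write $\llbracket S \rrbracket_V$ for the set of values (of type A)
$\llbracket S \rrbracket_V = \{V \mid \cdot \vdash V : S\}$ and $\llbracket S \rrbracket_K$ for the set of continuations (accepting A)
$\llbracket S \rrbracket_K = \{K \mid \cdot \vdash K : {\bullet}S\}$. For any frame refinement $\Psi \sqsubset \Delta$, we write
$\llbracket \Psi \rrbracket_E$ for the set of expressions (in Δ) $\llbracket \Psi \rrbracket_E = \{E \mid \Psi \vdash E : \checkmark\}$ and
$\llbracket \Psi \rrbracket_\sigma$ for the set of substitutions (for Δ) $\llbracket \Psi \rrbracket_\sigma = \{\sigma \mid \cdot \vdash \sigma : \Psi\}$.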

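Lemma 6.2.44. Let $S \sqsubset A$, $\Psi \sqsubset \Delta$, and $T \sqsubset B$. Suppose A has ancestral choice. Then for any
$p :: (\Delta \Vdash B)$ and $\kappa : {\bullet}A \in \Delta$:

1. If $\Psi \not\le_p T$ then there is some $K \in \llbracket T \rrbracket_K$ and $\sigma \in \llbracket \Psi \rrbracket_\sigma$ such that $p[\sigma] \not\perp K$.

2. If $S \not\le_\kappa \Psi$ then there is some value $V \in \llbracket S \rrbracket_V$ and substitution $\sigma \in \llbracket \Psi \rrbracket_\sigma$ such that $V \not\perp \sigma(\kappa)$.

Proof. Recall the axiomatization of $\Psi \le_p T$ and $S \le_\kappa \Psi$ (Theorem 6.2.22):

$$\frac{\exists \Psi' \in \llbracket p : T \rrbracket \quad \forall {\bullet}S \in \Psi'(\kappa) \quad S \le_\kappa \Psi}{\Psi \le_p T} \qquad \frac{\forall \Psi' \in \llbracket p : S \rrbracket \quad \exists {\bullet}T \in \Psi(\kappa) \quad \Psi' \le_p T}{S \le_\kappa \Psi}$$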

Inverting  these  rules,  we  construct  the  counterexamples  by  recursion  on  the  definition  ordering: 

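1. From the assumption $\Psi \not\le_p T$, we have that for all $\Psi' \in \llbracket p : T \rrbracket$, there is some continuation
variable $\kappa_i$ and ${\bullet}S_i \in \Psi'(\kappa_i)$ such that $S_i \not\le_{\kappa_i} \Psi$. By applying (2) recursively (on
$S_i \prec \Psi' \prec T$), we obtain a (finite) set of values $V_i \in \llbracket S_i \rrbracket_V$ and substitutions
$\sigma_i \in \llbracket \Psi \rrbracket_\sigma$ such that $V_i \not\perp \sigma_i(\kappa_i)$. Now define $K \in \llbracket T \rrbracket_K$ by setting
$K\,p = \curlyvee_i (\kappa_i\, V_i)$, and $\sigma \in \llbracket \Psi \rrbracket_\sigma$ by setting $\sigma\,\kappa_i = \curlywedge_j\, \sigma_j(\kappa_i)$ (and setting
$K\,p'$ and $\sigma\,\kappa'$ to arbitrary type-safe expressions/continuations everywhere else).

First, we can verify that K and σ really have the indicated types. In particular, for K we
must check that for all $\Psi' \in \llbracket p : T \rrbracket$, the expression $\curlyvee_i (\kappa_i\, V_i)$ is well-typed in $\Psi'$, and this
holds because in particular $\kappa_i\, V_i$ is well-typed. Likewise, for σ we must check that for all
${\bullet}S \in \Psi(\kappa_i)$, $\curlywedge_j\, \sigma_j(\kappa_i) \in \llbracket S \rrbracket_K$, and this holds because $\sigma_j(\kappa_i) \in \llbracket S \rrbracket_K$ for all j.

Second, we can verify these provide a safety violation:

$$\begin{array}{rcl}
p[\sigma] \perp K & \text{iff} & \curlyvee_i (\kappa_i\, V_i) \perp \sigma \\
 & \text{iff} & \exists i.\ \kappa_i\, V_i \perp \sigma \\
 & \text{iff} & \exists i.\ V_i \perp \curlywedge_j\, \sigma_j(\kappa_i) \\
 & \text{iff} & \exists i.\ \forall j.\ V_i \perp \sigma_j(\kappa_i)
\end{array}$$

but $\kappa_i\, V_i \not\perp \sigma_i$ for all i, so $p[\sigma] \not\perp K$.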


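2. From the assumption $S \not\le_\kappa \Psi$ we know there is some value pattern p and $\Psi' \in \llbracket p : S \rrbracket$,
such that for all ${\bullet}T_i \in \Psi(\kappa)$, we have $\Psi' \not\le_p T_i$. By applying (1) recursively (on $\Psi' \prec T_i \prec \Psi$),
we obtain a (finite) set of continuations $K_i \in \llbracket T_i \rrbracket_K$ and substitutions $\sigma_i \in \llbracket \Psi' \rrbracket_\sigma$ such that
$p[\sigma_i] \not\perp K_i$. Now define $V \in \llbracket S \rrbracket_V$ by $V = p[\curlywedge_j\, \sigma_j]$ and $\sigma \in \llbracket \Psi \rrbracket_\sigma$ by $\sigma\,\kappa = \curlyvee_i K_i$.

First we verify the types. To show $V \in \llbracket S \rrbracket_V$, we need that $\curlywedge_j\, \sigma_j \in \llbracket \Psi' \rrbracket_\sigma$, and this
holds because $\sigma_j \in \llbracket \Psi' \rrbracket_\sigma$ for all j. To show $\sigma \in \llbracket \Psi \rrbracket_\sigma$, we need that for all ${\bullet}T_j \in \Psi(\kappa)$,
$\curlyvee_i K_i \in \llbracket T_j \rrbracket_K$, and this holds because in particular $K_j \in \llbracket T_j \rrbracket_K$.

Second, we verify the safety violation:

$$\begin{array}{rcl}
V \perp \sigma(\kappa) & \text{iff} & \exists i.\ p[\curlywedge_j\, \sigma_j] \perp K_i \\
 & \text{implies} & \exists i.\ \forall j.\ p[\sigma_j] \perp K_i
\end{array}$$

but $p[\sigma_i] \not\perp K_i$ for all i, so $V \not\perp \sigma(\kappa)$.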


□ 


Theorem  6.2.45  (Conditional  completeness  of  subtyping). 

1.  If  A  has  ancestral  choice,  then  S  Xa  T  implies  S  Xa  T. 

2.  If  A  has  ancestral  choice,  then  <P  Ya  'P/  implies  <P  Ya  'P'. 

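Proof. Recall that the following rules are complete for the subtyping/subframing relationships:

$$\frac{\forall \Psi \in \llbracket p : S \rrbracket \quad \Psi \le_p T}{S \le T} \qquad \frac{\forall {\bullet}S \in \Psi'(\kappa) \quad S \le_\kappa \Psi}{\Psi \le \Psi'}$$

Inverting these rules, we construct counterexamples to subtyping by applying the previous
lemma.

1. There exists a pattern $p :: (\Delta \Vdash A)$, $\Psi \sqsubset \Delta \in \llbracket p : S \rrbracket$, some continuation $K \in \llbracket T \rrbracket_K$ and
substitution $\sigma \in \llbracket \Psi \rrbracket_\sigma$, such that $p[\sigma] \not\perp K$. Since $p[\sigma] \in \llbracket S \rrbracket_V$, this provides the desired
counterexample.

2. There exists a variable $\kappa : {\bullet}A \in \Delta$, ${\bullet}S \sqsubset A \in \Psi'(\kappa)$, some value $V \in \llbracket S \rrbracket_V$ and substitution
$\sigma \in \llbracket \Psi \rrbracket_\sigma$, such that $V \not\perp \sigma(\kappa)$. Then $\kappa\, V \not\perp \sigma$ provides the desired counterexample.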


□ 

After  we  have  a  conditional  completeness  theorem,  naturally  the  next  question  is  whether  we 
can  make  it  less  conditional.  Of  course,  the  simplest  answer  is  "Yes",  if  we  are  willing  to  take 
the  choice  operators  as  primitives  in  our  language. 

Theorem 6.2.46. In $\mathcal{L}^+$ extended with angelic and demonic choice operators at generic frames, the identity
coercion interpretation is sound and complete for the no-counterexamples interpretation.

Proof.  Immediate  corollary  of  the  soundness  theorem  (6.2.31)  and  the  conditional  completeness 
theorem.  □ 

On  the  other  hand,  we  saw  in  the  previous  section  that  there  are  situations  where  we  can  already 
build  a  choice  operation  at  a  particular  frame  out  of  choice  operations  at  smaller  frames.  Can  we 
retain  completeness  while  sufficing  with  fewer  primitives?  And  what  happens  when  we  relax 
the  assumption  of  finitary  polymorphism?  The  general  answer  to  these  questions  is  beyond  our 
scope  here.  In  a  sense,  the  choice  operations  are  "topological  properties",  and  so  fully  answering 
these questions about subtyping requires developing a topological interpretation of $\mathcal{L}$. I think
that  is  a  very  worthwhile  project,  but  will  not  pursue  it  here. 


6.3 Refining full $\mathcal{L}$

By now, we have been through enough iterations of dualizing that it is probably a better exercise
for the reader to work out the rules of refinement typing for negative types on their own, rather
than reading through more definitions. For reference, however, we include the rules of refinement
typing for full $\mathcal{L}$ in Figure 6.3, as well as the definition of negative intersections/unions
and some mixed polarity refinement constructors in Figure 6.4.

6.4  Related  Work 

I gave a brief history of ML typing and of refinement types at the start of the chapter. Intersection
and union types are also considered in ludics, and in fact Girard [2001] sent out "an invitation to
revisit the extant approaches to subtyping in the light of ludics". In some ways this work can be
seen as answering that invitation, although it originated from the practical motivation of better
understanding the strange sort of operationally-sensitive typing phenomena uncovered by Davies
and Pfenning [2000] and Dunfield and Pfenning [2004]. The question of subtyping completeness
investigated in §6.2.10-§6.2.12 is an instance of the more general question of completeness in
proof theory, i.e., the relationship between proofs and countermodels. That question of course
has been studied by many people. It seems very likely that these questions can be fruitfully
stated in more topological terms, particularly along the lines of Paul Taylor's Abstract Stone
Duality and Martín Escardó's synthetic topology [Taylor, 2002, Escardó, 2004].

Frame refinements
$\Psi \sqsubset \Delta ::= \kappa : {\bullet}S \mid x : S \mid \cdot \mid \Psi_1, \Psi_2 \mid \top \mid \Psi_1 \wedge \Psi_2$
Context refinements
$\Xi \sqsubset \Gamma ::= \cdot \mid \Xi, \Psi$

Refinement typing judgments
$\Xi \vdash V : S$   V has sort S
$\Xi \vdash K : {\bullet}S$   K accepts sort S
$\Xi \vdash \sigma : \Psi$   σ satisfies Ψ
$\Xi \vdash E : \checkmark$   E is well-sorted


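Value and continuation typing

$$\frac{\Psi \in \llbracket p : S^+ \rrbracket \quad \Xi \vdash \sigma : \Psi}{\Xi \vdash p[\sigma] : S^+} \qquad \frac{\Psi \in \llbracket p : S^+ \rrbracket \;\longrightarrow\; \Xi, \Psi \vdash K(p) : \checkmark}{\Xi \vdash K : {\bullet}S^+}$$

$$\frac{\Psi \in \llbracket d : {\bullet}S^- \rrbracket \;\longrightarrow\; \Xi, \Psi \vdash V(d) : \checkmark}{\Xi \vdash V : S^-} \qquad \frac{\Psi \in \llbracket d : {\bullet}S^- \rrbracket \quad \Xi \vdash \sigma : \Psi}{\Xi \vdash d[\sigma] : {\bullet}S^-}$$

Complex hypotheses

$$\frac{\Psi_1 \in \llbracket p : S_1^+ \rrbracket \;\cdots\; \Psi_n \in \llbracket p : S_n^+ \rrbracket \;\longrightarrow\; \Xi(\Psi_1 \wedge \cdots \wedge \Psi_n) \vdash t_p : J}{\Xi(x : S_1^+ \wedge \cdots \wedge x : S_n^+) \vdash \mathsf{case}\ x\ \mathsf{of}\ p \mapsto t_p : J}$$

$$\frac{\Psi_1 \in \llbracket d : {\bullet}S_1^- \rrbracket \;\cdots\; \Psi_n \in \llbracket d : {\bullet}S_n^- \rrbracket \;\longrightarrow\; \Xi(\Psi_1 \wedge \cdots \wedge \Psi_n) \vdash t_d : J}{\Xi(\kappa : {\bullet}S_1^- \wedge \cdots \wedge \kappa : {\bullet}S_n^-) \vdash \mathsf{case}\ \kappa\ \mathsf{of}\ d \mapsto t_d : J}$$

Substitution typing

$$\frac{}{\Xi \vdash \sigma : \top} \qquad \frac{\Xi \vdash \sigma : \Psi_1 \quad \Xi \vdash \sigma : \Psi_2}{\Xi \vdash \sigma : \Psi_1 \wedge \Psi_2} \qquad \frac{}{\Xi \vdash \cdot : \cdot}$$

$$\frac{\Xi \vdash \sigma_1 : \Psi_1 \quad \Xi \vdash \sigma_2 : \Psi_2}{\Xi \vdash (\sigma_1, \sigma_2) : (\Psi_1, \Psi_2)}$$

Expression typing

$$\frac{{\bullet}S^+ \in \Xi(\kappa) \quad \Xi \vdash V : S^+}{\Xi \vdash \kappa\, V : \checkmark} \qquad \frac{S^- \in \Xi(x) \quad \Xi \vdash K : {\bullet}S^-}{\Xi \vdash x\, K : \checkmark}$$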

E  F  n  :  /  El/U  :  / 


Figure 6.3: Refinement typing of $\mathcal{L}$
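
$$\begin{array}{rcl}
\llbracket d : {\bullet}S \cup T \rrbracket & = & \{(\Psi_1 \wedge \Psi_2) \mid \Psi_1 \in \llbracket d : {\bullet}S \rrbracket,\ \Psi_2 \in \llbracket d : {\bullet}T \rrbracket\} \\
\llbracket d : {\bullet}S \cap T \rrbracket & = & \llbracket d : {\bullet}S \rrbracket \cup \llbracket d : {\bullet}T \rrbracket \\
\llbracket d : {\bullet}\bot \rrbracket & = & \{\top\} \\
\llbracket d : {\bullet}\top \rrbracket & = & \emptyset \\
\llbracket x : {\downarrow}S^- \rrbracket & = & \{(x : S^-)\} \\
\llbracket \kappa : {\bullet}{\uparrow}S^+ \rrbracket & = & \{(\kappa : {\bullet}S^+)\} \\
\llbracket p \mathbin{@} d : {\bullet}S^+ \to T^- \rrbracket & = & \{(\Psi_1, \Psi_2) \mid \Psi_1 \in \llbracket p : S^+ \rrbracket,\ \Psi_2 \in \llbracket d : {\bullet}T^- \rrbracket\}
\end{array}$$

Figure 6.4: Definition of negative intersections/unions and some mixed polarity refinement
constructors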

Chapter 7

Conclusion 


We  have  given  a  new  take  on  the  proofs-as-programs  analogy,  that,  I  hope,  gives  it  new  force. 
By  analyzing  the  structure  of  proofs  and  refutations  in  terms  of  patterns,  we  have  designed  a 
propositions-as-types  interpretation  that: 

•  Accounts  for  features  in  modern  functional  programming  languages,  such  as  evaluation 
order  and  pattern-matching. 

• Includes the ability to mix evaluation strategies, and to define types by either their
constructors or destructors.

• Elegantly accounts for untyped computation, intrinsic types, and extrinsic refinement types.

• Is in many ways easier to reason about meta-theoretically than standard typed λ-calculus,
by using techniques from infinitary proof theory.

Here we have only scratched the surface, though. Important theoretical work that needs to be
done is to extend our approach to the rest of type theory, including but not limited to parametric
polymorphism, modalities, module systems, and dependent types. The latter, in particular, we
used in Chapter 5 to give embeddings of our language into two existing languages; it seems
only fair that we should be able to account for dependent types on our own, and give a
meta-circular interpreter for $\mathcal{L}$ in $\mathcal{L}$.¹ And polarization and focusing would hopefully shed light on
some of the open problems related to these features, in particular the treatment of equality, and
how to account for effects.

¹ A restricted formulation of dependent types in polarized type theory has already been developed by Licata and
Harper [2009], and should be sufficient for defining this meta-circular interpreter.

Another important project that remains to be carried out is to apply our theoretical framework
towards the construction of a real working compiler for a functional language, with a rich type
theory including intersections and unions. Ideally, we could rely on proof-theoretic principles
as guides at each stage of the compiler pipeline. We took a first step towards this goal with
the embeddings of Chapter 5, particularly the Twelf embedding, which made the syntax of the
language first-order via defunctionalization. It seems, moreover, that there is a link to be drawn
with the recent work initiated by Danvy [Danvy, 2003, Ager et al., 2003, Danvy and Millikin, 2006,
Danvy, 2008], which attempts to build systematic connections between definitional interpreters
and abstract machines, by way of systematic program transformations such as CPS translation,
closure-conversion, and defunctionalization.
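
For readers unfamiliar with defunctionalization, the following Python sketch may help. It is an illustrative toy, not taken from the thesis or from the cited work: a higher-order CPS function is turned into a first-order one by replacing each continuation closure with a data constructor and interpreting those constructors with a separate `apply` function. All names (`sum_cps`, `Halt`, `AddThen`, `apply`, `sum_defun`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


# Higher-order version: continuations are closures.
def sum_cps(xs: List[int], k: Callable[[int], int]) -> int:
    if not xs:
        return k(0)
    return sum_cps(xs[1:], lambda r: k(xs[0] + r))


# Defunctionalized version: one constructor per kind of closure.
@dataclass(frozen=True)
class Halt:
    """Represents the initial continuation (return the result)."""


@dataclass(frozen=True)
class AddThen:
    """Represents `lambda r: k(n + r)`, storing its free variables."""
    n: int
    k: "Kont"


Kont = Union[Halt, AddThen]


def apply(k: Kont, r: int) -> int:
    """First-order interpreter for the continuation constructors."""
    if isinstance(k, Halt):
        return r
    return apply(k.k, k.n + r)


def sum_defun(xs: List[int], k: Kont) -> int:
    if not xs:
        return apply(k, 0)
    return sum_defun(xs[1:], AddThen(xs[0], k))
```

Calling `sum_cps([1, 2, 3], lambda r: r)` and `sum_defun([1, 2, 3], Halt())` computes the same sum; in the second version the continuation is plain data that can be inspected or serialized, and `apply` plays a role loosely comparable to the "apply function" for continuations in the Twelf embedding.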

At a more basic level, though, we have to evaluate whether the type theory is sufficiently
general for reasoning about effects at a fine enough level of granularity. The language $\mathcal{L}$
intrinsically enforces continuation-passing style, and that means that effects are always sequentialized,
and the type system always sound. But it could also mean that sometimes programs are
over-sequentialized and the type system too conservative, when no effects are present. Modalities for
enforcing purity/linearity would no doubt be helpful. But it may also be necessary to move to a
more general abstraction such as delimited continuations [Danvy and Filinski, 1990]. Hopefully,
even in a more general framework, the basic concepts of polarity and focalization should remain
illuminating.

Appendix A

Agda embedding of $\mathcal{L}^+$


module  LPos  where 


-- TYPES

data Pos : Set where
  _+_  : Pos -> Pos -> Pos   -- binary sums
  void : Pos                 -- void
  _*_  : Pos -> Pos -> Pos   -- binary products
  unit : Pos                 -- unit
  ¬_   : Pos -> Pos          -- continuations
  bool : Pos                 -- booleans
  nat  : Pos                 -- naturals
  dom  : Pos                 -- recursive domain
infixr 15 _+_
infixr 16 _*_


-- FRAMES & INDICES

data Frame : Set where
  •_  : Pos -> Frame
  ·   : Frame
  _,_ : Frame -> Frame -> Frame
infixr 13 _,_

infix 10 _∈_
data _∈_ : Frame -> Frame -> Set where
  here  : ∀ {Δ} -> Δ ∈ Δ
  left  : ∀ {Δ Δ₁ Δ₂} -> Δ ∈ Δ₁ -> Δ ∈ (Δ₁ , Δ₂)
  right : ∀ {Δ Δ₁ Δ₂} -> Δ ∈ Δ₂ -> Δ ∈ (Δ₁ , Δ₂)


-- PATTERNS

infix 10 _⊩_
data _⊩_ : Frame -> Pos -> Set where
  hole  : ∀ {A} -> • A ⊩ ¬ A
  #<>   : · ⊩ unit
  #inl  : ∀ {Δ A B}
        -> Δ ⊩ A
        -> Δ ⊩ A + B
  #inr  : ∀ {Δ A B}
        -> Δ ⊩ B
        -> Δ ⊩ A + B
  #pair : ∀ {Δ₁ Δ₂ A B}
        -> Δ₁ ⊩ A -> Δ₂ ⊩ B
        -> Δ₁ , Δ₂ ⊩ A * B
  #tt   : · ⊩ bool
  #ff   : · ⊩ bool
  #z    : · ⊩ nat
  #s_   : ∀ {Δ}
        -> Δ ⊩ nat
        -> Δ ⊩ nat
  #dn   : ∀ {Δ}
        -> Δ ⊩ nat
        -> Δ ⊩ dom
  #dk   : ∀ {Δ}
        -> Δ ⊩ ¬ dom
        -> Δ ⊩ dom
infixr 15 #s_


-- CONTEXTS & JUDGMENTS

data Ctx : Set where
  ··   : Ctx
  _,,_ : Ctx -> Frame -> Ctx
infixl 12 _,,_

infixr 10 _∈∈_
data _∈∈_ : Frame -> Ctx -> Set where
  x0  : ∀ {Δ Γ Δ'} -> Δ ∈ Δ' -> Δ ∈∈ Γ ,, Δ'
  xS_ : ∀ {Δ Γ Δ'} -> Δ ∈∈ Γ -> Δ ∈∈ Γ ,, Δ'
infixr 15 xS_

data J : Set where
  True  : Pos -> J
  False : Pos -> J
  All_  : Frame -> J
  #     : J
infix 10 All_


-- TERMS

infix 8 _⊢_
infix 13 _[_]

codata _⊢_ : Ctx -> J -> Set where
  -- values
  _[_]  : ∀ {Γ Δ A}
        -> Δ ⊩ A -> Γ ⊢ All Δ
        -> Γ ⊢ True A
  -- continuations
  con_  : ∀ {Γ A}
        -> (∀ {Δ} -> Δ ⊩ A -> Γ ,, Δ ⊢ #)
        -> Γ ⊢ False A
  -- substitutions
  sHole : ∀ {Γ A} -> Γ ⊢ False A -> Γ ⊢ All • A
  sNil  : ∀ {Γ} -> Γ ⊢ All ·
  sJoin : ∀ {Γ Δ₁ Δ₂}
        -> Γ ⊢ All Δ₁ -> Γ ⊢ All Δ₂ -> Γ ⊢ All Δ₁ , Δ₂
  -- expressions
  throw : ∀ {Γ A}
        -> • A ∈∈ Γ -> Γ ⊢ True A
        -> Γ ⊢ #
  ℧     : ∀ {Γ} -> Γ ⊢ #


-- SOME EASY LEMMAS

appSub : ∀ {Γ Δ}
       -> Γ ⊢ All Δ
       -> (∀ {A} -> • A ∈ Δ -> Γ ⊢ False A)
appSub (sHole K) here ~ K
appSub sNil ()  -- the empty frame has no variables, so this case is refuted
appSub (sJoin σ₁ σ₂) (left κ) ~ appSub σ₁ κ
appSub (sJoin σ₁ σ₂) (right κ) ~ appSub σ₂ κ

sub_ : ∀ {Γ Δ}
     -> (∀ {A} -> • A ∈ Δ -> Γ ⊢ False A)
     -> Γ ⊢ All Δ
sub_ {Δ = ·} f ~ sNil
sub_ {Δ = Δ₁ , Δ₂} f ~ sJoin (sub \x -> f (left x)) (sub \x -> f (right x))
sub_ {Δ = • _} f ~ sHole (f here)

appCon : ∀ {Γ A}
       -> Γ ⊢ False A
       -> ∀ {Δ} -> Δ ⊩ A -> Γ ,, Δ ⊢ #
appCon (con K) p ~ K p

trans∈ : {Δ₁ Δ₂ Δ₃ : Frame}
       -> Δ₁ ∈ Δ₂ -> Δ₂ ∈ Δ₃ -> Δ₁ ∈ Δ₃
trans∈ Δ₁ here = Δ₁
trans∈ here Δ₁ = Δ₁
trans∈ Δ₁ (left Δ₂) = left (trans∈ Δ₁ Δ₂)
trans∈ Δ₁ (right Δ₂) = right (trans∈ Δ₁ Δ₂)

trans∈∈ : ∀ {Δ₁ Δ₂ Γ} -> Δ₁ ∈ Δ₂ -> Δ₂ ∈∈ Γ -> Δ₁ ∈∈ Γ
trans∈∈ x (x0 y) = x0 (trans∈ x y)
trans∈∈ x (xS y) = xS (trans∈∈ x y)


-- IDENTITY TERMS

mutual
  IdCon : ∀ {Γ A} -> • A ∈∈ Γ -> Γ ⊢ False A
  IdCon κ ~ con \p -> throw (xS κ) (p [ IdSub (x0 here) ])

  IdSub : ∀ {Γ Δ} -> Δ ∈∈ Γ -> Γ ⊢ All Δ
  IdSub {Δ = • A} κ ~ sHole (IdCon κ)
  IdSub {Δ = ·} α ~ sNil
  IdSub {Δ = Δ₁ , Δ₂} α ~ sJoin (IdSub (trans∈∈ (left here) α))
                                 (IdSub (trans∈∈ (right here) α))


-- CONTEXT SPLITTING

-- to define composition (and weakening), we first
-- need to define a notion of context splitting
data split : Ctx -> Ctx -> Frame -> Ctx -> Set where
  here  : ∀ {Δ Γ}
        -> split (Γ ,, Δ) Γ Δ ··
  skip  : ∀ {Γ Γ₁ Δ Γ₂ Δ'}
        -> split Γ Γ₁ Δ Γ₂
        -> split (Γ ,, Δ') Γ₁ Δ (Γ₂ ,, Δ')
  assoc : ∀ {Γ Δ₁ Δ₂ Γ₁ Δ Γ₂}
        -> split (Γ ,, Δ₁ ,, Δ₂) Γ₁ Δ Γ₂
        -> split (Γ ,, (Δ₁ , Δ₂)) Γ₁ Δ Γ₂
  nil   : ∀ {Γ Γ₁ Δ Γ₂}
        -> split Γ Γ₁ Δ Γ₂
        -> split (Γ ,, ·) Γ₁ Δ Γ₂

_++_ : Ctx -> Ctx -> Ctx
Γ₁ ++ ·· = Γ₁
Γ₁ ++ (Γ₂ ,, Δ) = (Γ₁ ++ Γ₂) ,, Δ
infixr 12 _++_

data Either (A B : Set) : Set where
  Inl : A -> Either A B
  Inr : B -> Either A B

casevar : ∀ {Γ Γ₁ Δ Γ₂ A}
        -> split Γ Γ₁ Δ Γ₂ -> • A ∈∈ Γ
        -> Either (• A ∈ Δ) (• A ∈∈ Γ₁ ++ Γ₂)
casevar here (x0 κ) = Inl κ
casevar here (xS κ) = Inr κ
casevar (skip s) (x0 κ) = Inr (x0 κ)
casevar (skip s) (xS κ) with casevar s κ
... | Inl κ' = Inl κ'
... | Inr κ' = Inr (xS κ')
casevar (assoc s) (x0 (left κ)) = casevar s (xS (x0 κ))
casevar (assoc s) (x0 (right κ)) = casevar s (x0 κ)
casevar (assoc s) (xS κ) = casevar s (xS (xS κ))
casevar (nil s) (xS κ) = casevar s κ
casevar (nil s) (x0 ())


-- WEAKENING

weakvar : ∀ {Γ Γ₁ Δ Γ₂ Δ₀}
        -> split Γ Γ₁ Δ Γ₂ -> Δ₀ ∈∈ Γ₁ ++ Γ₂
        -> Δ₀ ∈∈ Γ
weakvar here κ = (xS κ)
weakvar (skip s) (xS κ) = xS (weakvar s κ)
weakvar (skip s) (x0 κ) = x0 κ
weakvar (assoc s) κ with weakvar s κ
... | x0 κ' = x0 (right κ')
... | xS x0 κ' = x0 (left κ')
... | xS xS κ' = xS κ'
weakvar (nil s) κ = xS (weakvar s κ)

weaktm : ∀ {Γ Γ₁ Δ Γ₂ J}
       -> split Γ Γ₁ Δ Γ₂ -> Γ₁ ++ Γ₂ ⊢ J
       -> Γ ⊢ J
weaktm s (p [ σ ]) ~ p [ weaktm s σ ]
weaktm s (con K) ~ con (\p -> weaktm (skip s) (K p))
weaktm s (sHole K) ~ sHole (weaktm s K)
weaktm s (sNil) ~ sNil
weaktm s (sJoin σ₁ σ₂) ~ sJoin (weaktm s σ₁) (weaktm s σ₂)
weaktm s (throw κ V) ~ throw (weakvar s κ) (weaktm s V)
weaktm s ℧ ~ ℧


-- COMPOSITION

mutual
  Con∘Val : ∀ {Γ A} -> Γ ⊢ False A -> Γ ⊢ True A -> Γ ⊢ #
  Con∘Val K (p [ σ ]) ~ Tm∘Sub here (appCon K p) σ

  Tm∘Sub : ∀ {Γ Γ₁ Δ Γ₂ J} -> split Γ Γ₁ Δ Γ₂ -> Γ ⊢ J
         -> Γ₁ ++ Γ₂ ⊢ All Δ -> Γ₁ ++ Γ₂ ⊢ J
  Tm∘Sub s (throw κ V) σ with casevar s κ
  ... | Inl κ' ~ Con∘Val (appSub σ κ') (Tm∘Sub s V σ)
  ... | Inr κ' ~ throw κ' (Tm∘Sub s V σ)
  Tm∘Sub s (p [ σ₀ ]) σ ~ p [ Tm∘Sub s σ₀ σ ]
  Tm∘Sub s sNil σ ~ sNil
  Tm∘Sub s (sJoin σ₁ σ₂) σ ~ sJoin (Tm∘Sub s σ₁ σ) (Tm∘Sub s σ₂ σ)
  Tm∘Sub s (sHole K) σ ~ sHole (Tm∘Sub s K σ)
  Tm∘Sub s (con K) σ ~ con \p -> Tm∘Sub (skip s) (K p) (weaktm here σ)
  Tm∘Sub s ℧ σ ~ ℧


-- ENVIRONMENT SEMANTICS

data Env : Ctx -> Set where
  emp    : Env ··
  _bind_ : ∀ {Γ Δ} -> Env Γ -> Γ ⊢ All Δ -> Env (Γ ,, Δ)
infixl 8 _bind_

lookup : ∀ {Γ A}
       -> Env Γ -> • A ∈∈ Γ -> Γ ⊢ False A
lookup emp ()  -- impossible pointer into empty environment
lookup (γ bind σ) (x0 κ) = weaktm here (appSub σ κ)
lookup (γ bind σ) (xS κ) = weaktm here (lookup γ κ)

data Prog : Set where
  prog : ∀ {Γ} -> Env Γ -> Γ ⊢ # -> Prog

codata Result : Set where
  abort : Result
  step  : Result -> Result

eval : Prog -> Result
eval (prog γ (throw κ (p [ σ ]))) ~
  step (eval (prog (γ bind σ) (appCon (lookup γ κ) p)))
eval (prog γ ℧) ~ abort


-- SAFETY TESTING

data Bool : Set where
  True  : Bool
  False : Bool

data Nat : Set where
  Z  : Nat
  S_ : Nat -> Nat
infixr 12 S_

safeN : Result -> Nat -> Bool
safeN R Z = True
safeN abort (S _) = False
safeN (step R) (S n) = safeN R n

data Void : Set where
data Unit : Set where
  u : Unit

isTrue : Bool -> Set
isTrue True = Unit
isTrue False = Void

isFalse : Bool -> Set
isFalse True = Void
isFalse False = Unit


-- EXAMPLES

selfapp : ∀ {Γ} -> Γ ⊢ False dom
selfapp ~ con K
  where
    K : ∀ {Δ} -> Δ ⊩ dom -> _ ,, Δ ⊢ #
    K (#dk hole) ~ throw (x0 here) (#dk hole [ sHole selfapp ])
    K (#dn _) ~ ℧

ωω : Prog
ωω = prog (emp bind sHole selfapp)
          (throw (x0 here) (#dk hole [ sHole selfapp ]))

safe10ωω : isTrue (safeN (eval ωω) (S S S S S S S S S S Z))
safe10ωω = u

isEven : ∀ {Γ} -> • bool ∈∈ Γ -> Γ ⊢ False nat
isEven κ ~ con K
  where
    K : ∀ {Δ} -> Δ ⊩ nat -> _ ,, Δ ⊢ #
    K #z ~ throw (xS κ) (#tt [ sNil ])
    K (#s #z) ~ throw (xS κ) (#ff [ sNil ])
    K (#s #s n) ~ K n

branch : ∀ {Γ} -> Γ ⊢ # -> Γ ⊢ # -> Γ ⊢ False bool
branch E₁ E₂ ~ con K
  where
    K : ∀ {Δ} -> Δ ⊩ bool -> _ ,, Δ ⊢ #
    K #tt ~ weaktm here E₁
    K #ff ~ weaktm here E₂

even9 : Prog
even9 = prog (emp
              bind sHole selfapp
              bind sHole (branch (throw (x0 here) (#dk hole [ sHole selfapp ])) ℧)
              bind sHole (isEven (x0 here)))
             (throw (x0 here) (#s #s #s #s #s #s #s #s #s #z [ sNil ]))

safe1even9 : isTrue (safeN (eval even9) (S Z))
safe1even9 = u

safe2even9 : isTrue (safeN (eval even9) (S S Z))
safe2even9 = u

unsafe3even9 : isFalse (safeN (eval even9) (S S S Z))
unsafe3even9 = u
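
The step-bounded safety test used in the examples above can be mimicked outside Agda. The following Python sketch is an informal analogue of `safeN`, assuming a result is modelled as a lazy iterator that yields once per `step` and is exhausted exactly when the program aborts; the names `safe_n`, `abort_after`, and `diverge` are hypothetical and do not appear in the listing.

```python
from typing import Iterator


def safe_n(result: Iterator[None], n: int) -> bool:
    """True if the result takes n steps without aborting (cf. safeN)."""
    for _ in range(n):
        try:
            next(result)          # observe one `step`
        except StopIteration:     # the stream ended: `abort`
            return False
    return True                   # budget exhausted: safe for n steps


def abort_after(k: int) -> Iterator[None]:
    """A result that takes k steps and then aborts."""
    for _ in range(k):
        yield None


def diverge() -> Iterator[None]:
    """A result that steps forever, so it is safe for every bound."""
    while True:
        yield None
```

Under this model, a program that aborts after two steps is safe for bounds 1 and 2 but not 3, which is the same pattern asserted by `safe1even9`, `safe2even9`, and `unsafe3even9`, while a diverging program passes any finite bound, as in `safe10ωω`. Each call consumes the iterator, so a fresh one should be created per test.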

Appendix B

Twelf embedding of $\mathcal{L}^+$


%%%%%%%%%%%%%%%%%%%%
% TYPES
%%%%%%%%%%%%%%%%%%%%

pos : type.
int : pos.                    % primitive integers
+ : pos -> pos -> pos.        % binary sums
void : pos.                   % void
* : pos -> pos -> pos.        % binary products
unit : pos.                   % unit
¬ : pos -> pos.               % continuations
rec : (pos -> pos) -> pos.    % recursive types

%infix right 13 +.
%infix right 14 *.

%% some type definitions
bool : pos                    % booleans
  = unit + unit.
nat : pos                     % unary nats
  = rec [X] unit + X.
list : pos -> pos             % cons lists
  = [A] rec [X] unit + A * X.
→ : pos -> pos -> pos         % CBV-CPS functions
  = [A] [B] ¬ (A * ¬ B).
%infix right 12 →.
D : pos                       % domain D = D -> D
  = rec [X] X → X.

%% primitive arithmetic
i : type.
z : i.
s : i -> i.   %prefix 10 s.

add : i -> i -> i -> type.
%mode add +M +N -P.
add/z : add z N N.
add/s : add (s M) N (s P) <- add M N P.
%worlds () (add M _ _).
%total (M) (add M _ _).
%unique add +M +N -P.


%%%%%%%%%%%%%%%%%%%%
% FRAMES
%%%%%%%%%%%%%%%%%%%%

frame : type.
· : frame.
, : frame -> frame -> frame.
• : pos -> frame.
%infix right 11 ,.


%%%%%%%%%%%%%%%%%%%%
% PATTERNS
%%%%%%%%%%%%%%%%%%%%

⊩ : frame -> pos -> type.
%infix none 9 ⊩.

n : i -> · ⊩ int.
inl : DΔ ⊩ A -> DΔ ⊩ A + B.
inr : DΔ ⊩ B -> DΔ ⊩ A + B.
u : · ⊩ unit.
pair : DΔ1 ⊩ A -> DΔ2 ⊩ B -> DΔ1 , DΔ2 ⊩ A * B.
fold : DΔ ⊩ A (rec A) -> DΔ ⊩ rec A.
hole : • A ⊩ ¬ A.

%% some pattern definitions
tt : · ⊩ bool = inl u.
ff : · ⊩ bool = inr u.
zz : · ⊩ nat = fold (inl u).
ss : DΔ ⊩ nat -> DΔ ⊩ nat = [p] fold (inr p).   %prefix 9 ss.
nil : · ⊩ list A = fold (inl u).
cons : DΔ1 ⊩ A -> DΔ2 ⊩ list A -> DΔ1 , DΔ2 ⊩ list A
  = [p1] [p2] fold (inr (pair p1 p2)).


%%%%%%%%%%%%%%%%%%%%
% JUDGMENTS
%%%%%%%%%%%%%%%%%%%%

j : type.
true : pos -> j.     %prefix 10 true.
false : pos -> j.    %prefix 10 false.
all : frame -> j.    %prefix 10 all.
# : j.


%%%%%%%%%%%%%%%%%%%%
% TERMS & TERMS-IN-CONTEXT
%%%%%%%%%%%%%%%%%%%%

tm : j -> type.
⊢ : frame -> j -> j.
%infix right 9 ⊢.

λ_ : tm J -> tm (DΔ ⊢ J).
λ, : tm (DΔ1 ⊢ DΔ2 ⊢ J) -> tm (DΔ1 , DΔ2 ⊢ J).
λcon : (tm (false A) -> tm J) -> tm (• A ⊢ J).
λsub : (tm (all DΔ) -> tm J) -> tm (DΔ ⊢ J).
%prefix 9 λ_. %prefix 9 λ,. %prefix 9 λcon. %prefix 9 λsub.

%% values %%
val : DΔ ⊩ A -> tm (all DΔ) -> tm (true A).

%% substitutions %%
shole : tm (false A) -> tm (all • A).
snil : tm (all ·).
sjoin : tm (all DΔ1) -> tm (all DΔ2) -> tm (all DΔ1 , DΔ2).

%% expressions %%
throw : tm (false A) -> tm (true A) -> tm #.
let : tm (all DΔ) -> tm (DΔ ⊢ #) -> tm #.
℧ : i -> tm #.

%% the "apply function" for continuations
%% we will give it interesting clauses later...
body : tm (false A) -> DΔ ⊩ A -> tm (DΔ ⊢ #) -> type.
%mode body +K +P -E.
%worlds () (body K P _).
%total P (body K P _).
%unique body +K +P -E.


%%%%%%%%%%%%%%%%%%%%
% OPERATIONAL SEMANTICS
%%%%%%%%%%%%%%%%%%%%

result : type.
halt : i -> result.

load : tm (all DΔ) -> tm (DΔ ⊢ J) -> tm J -> type.
%mode load +Sσ +T -T'.
ld/tm : load Sσ (λ_ T) T.
ld/join : load (sjoin Sσ1 Sσ2) (λ, T) T''
  <- load Sσ1 T T'
  <- load Sσ2 T' T''.
ld/con : load (shole K) (λcon T*) (T* K).
ld/sub : load Sσ (λsub T) (T Sσ).
%worlds () (load _ _ _).
%total (Sσ) (load Sσ _ _).
%unique load +Sσ +T -T'.

eval : tm # -> result -> type.
%mode eval +E -R.
ev/load : eval (let Sσ E) R
  <- load Sσ E E'
  <- eval E' R.
ev/throw : eval (throw K (val P Sσ)) R
  <- body K P E
  <- eval (let Sσ E) R.
ev/℧ : eval (℧ N) (halt N).

%worlds () (eval _ _).
%covers eval +E -R.

y  y  y  y  y  y  y  y  y  y  y  y  y  y  y  y  y  y  y  y 

/o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o 

°/o  EXAMPLE  CONTINUATIONS 
7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 

/o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o  /o 


ignore  :  tm  (false  A) . 
ignore/p  :  body  ignore  P 


(A_  15  z)  . 


exit  :  tm  (false  int) . 

exit/n  :  body  exit  (n  N)  (A_  15  N)  . 


"/."/.  Check  the  above  definitions  are  exhaustive... 
"/.total  P  (body  K  P  _)  . 

"/.unique  body  +K  +P  -E. 


iszero : tm # -> tm # -> tm (false int).
iszero/z : body (iszero Ez Enz) (n z) (λ_ Ez).
iszero/nz : body (iszero Ez Enz) (n (s _)) (λ_ Enz).

plus : tm (false int) -> tm (false (int * int)).
plus/mn : body (plus K) (pair (n M) (n N)) (λ_ throw K (val (n P) snil))
    <- add M N P.


succ : tm (false nat) -> tm (false nat).
succ/n : body (succ K) N (λsub [σ] throw K (val (ss N) σ)).

plus' : tm (false nat) -> tm (false (nat * nat)).
plus'/zn : body (plus' K) (pair zz N) (λ, λ_ λsub [σ]
    throw K (val N σ)).
plus'/sn : body (plus' K) (pair (ss M) N) (λsub [σ]
    throw (plus' (succ K)) (val (pair M N) σ)).

add1 : tm (false int) -> tm (false int).
add1/n : body (add1 K) (n N) (λ_ throw K (val (n (s N)) snil)).

n2i : tm (false int) -> tm (false nat).
n2i/zz : body (n2i K) zz (λ_ throw K (val (n z) snil)).
n2i/ss : body (n2i K) (ss N) (λsub [σ] throw (n2i (add1 K)) (val N σ)).

%% Check the above definitions are exhaustive...
%total P (body K P _).
%unique body +K +P -E.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% EXAMPLE EVALUATIONS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%query 1 *
  eval (throw (plus exit)
              (val (pair (n (s s z)) (n (s s z))) (sjoin snil snil)))
       R.

%query 1 *
  eval (throw (plus' (n2i exit))
              (val (pair (ss ss zz) (ss ss ss zz)) (sjoin snil snil)))
       R.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ENCODING OF DIRECT-STYLE
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% We begin by defining some continuation combinators...
fst : tm (false A) -> tm (false (A * B)).
fst/xy : body (fst K) (pair P1 P2) (λ, λsub [σ1] λsub [σ2] let σ1 E)
    <- body K P1 E.

snd : tm (false B) -> tm (false (A * B)).
snd/xy : body (snd K) (pair P1 P2) (λ, λsub [σ1] λsub [σ2] let σ2 E)
    <- body K P2 E.

case : tm (false A) -> tm (false B) -> tm (false (A + B)).
case/inl : body (case K1 K2) (inl P) E1
    <- body K1 P E1.
case/inr : body (case K1 K2) (inr P) E2
    <- body K2 P E2.

not : (tm (false A) -> tm #) -> tm (false ¬ A).
not/k : body (not K) hole (λcon [k] K k).

con : ({Δ} Δ ⊩ A -> tm (all Δ) -> tm #) -> tm (false A).
con/x : body (con K) P (λsub [σ] K _ P σ).

conV : (tm (true A) -> tm #) -> tm (false A).
conV/x : body (conV K) P (λsub [σ] K (val P σ)).

uncurry : (tm (false B) -> tm (false A)) -> tm (false (A * ¬ B)).
uncurry/pk : body (uncurry K) (pair P hole) (λ, λsub [σ1] λcon [k] throw (K k) (val P σ1)).

%total P (body K P _).
%unique body +K +P -E.

% Now we define macros for programming in direct-style...

%abbrev cmp : pos -> type = [A] tm (false A) -> tm #.

%abbrev Lift : tm (true A) -> cmp A
    = [x] [k] throw k x.
%abbrev
Pair : cmp A -> cmp B -> cmp (A * B)
    = [e1] [e2] [k] e1 (con [_] [p1] [σ1] e2 (con [_] [p2] [σ2]
        throw k (val (pair p1 p2) (sjoin σ1 σ2)))).

%abbrev
Fst : cmp (A * B) -> cmp A
    = [e] [k] e (fst k).

%abbrev
Snd : cmp (A * B) -> cmp B
    = [e] [k] e (snd k).

%abbrev
Inl : cmp A -> cmp (A + B)
    = [e] [k] e (con [_] [p] [σ1] throw k (val (inl p) σ1)).

%abbrev
Inr : cmp B -> cmp (A + B)
    = [e] [k] e (con [_] [p] [σ2] throw k (val (inr p) σ2)).

%abbrev
Case : cmp (A + B) -> (tm (true A) -> cmp C) -> (tm (true B) -> cmp C) -> cmp C
    = [e] [f] [g] [k] e (case (conV [x] (f x) k) (conV [y] (g y) k)).

%abbrev
Fn : (tm (true A) -> cmp B) -> cmp (A → B)
    = [f] Lift (val hole (shole (uncurry [k'] conV [x] (f x) k'))).

%abbrev
App : cmp (A → B) -> cmp A -> cmp B
    = [f] [e] [k]
      f (not [kf]
         e (con [_] [p] [σ1] throw kf (val (pair p hole) (sjoin σ1 (shole k))))).

%abbrev
Z : cmp int
    = Lift (val (n z) snil).

%abbrev
S : cmp int -> cmp int
    = [e] [k] e (add1 k).

%prefix 9 S.

%abbrev
Plus : cmp int -> cmp int -> cmp int
    = [e1] [e2] [k] e1 (conV [n1] e2 (conV [n2]
        (Pair (Lift n1) (Lift n2)) (plus k))).

%abbrev
Plus* : cmp int -> cmp int -> cmp int
    = [e1] [e2] [k] e1 (con [_] [n1] [σ1] e2 (con [_] [n2] [σ2]
        throw (plus k) (val (pair n1 n2) (sjoin σ1 σ2)))).
%abbrev
Abort0 : cmp A
    = [k] 15 z.

%abbrev
Abort1 : cmp A
    = [k] 15 (s z).


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% EXAMPLE EVALUATIONS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%abbrev
run : cmp int -> i -> type = [t] [n] eval (t exit) (halt n).

%query 1 *
  run (Plus (S S Z) (S S S Z)) N.

%query 1 *
  run (Plus (S S Z) Abort1) N.

%query 1 *
  run (Plus Abort0 Abort1) N.

%query 1 *
  run (App (Fn [x] Plus (Lift x) (S Z)) (S Z)) N.